Threads for safinaskar

    1. 2

      I sent my thoughts to the author ( https://github.com/radarroark/xit/issues/9 ). I will reproduce them here:


      Thanks for this project!

      Some ideas (note that I have merely read the blog post and didn’t dig further):

      • It may be a good idea to fully replicate git’s CLI, at least as an option. This will help spread the project
      • Migrate away from SHA1. It is broken. It is one of git’s most unfortunate design mistakes. Also, you should change hashes regularly anyway: https://valerieaurora.org/hash.html . (Well, actually migrating from SHA1 will likely break github compatibility, so, of course, it makes sense to support SHA1 for now. But please support other hashes, too. Don’t repeat git’s mistake: git originally simply hardcoded SHA1 everywhere.)
      • In the past I spent a lot of time researching CDC-and-deduplication. My findings are here: https://github.com/borgbackup/borg/issues/7674 . A short overview of FOSS solutions is here: https://lobste.rs/s/0itosu/look_at_rapidcdc_quickcdc#c_ygqxsl . In short, existing solutions are under-optimized, and there is a lot of low-hanging fruit here. I was very easily able to create a very small program in Rust that beats existing deduplication solutions by a wide margin (but my program doesn’t use CDC; see the first sketch after this list). So I suggest reading my ideas and comparing the speed of your solution against other solutions
      • Patch-based merging seems to be a killer feature (assuming it works well). So, I suggest making it the main advertising strategy. Linux devs often maintain their patchsets as series of patch files, not as git branches, exactly because git merging doesn’t work well. So, reach out to Linux devs and tell them about your tool. In particular, person number 2 in Linux, Greg KH, maintainer of the stable Linux trees, stores his stable trees as series of patch files in git (aaaah!). Here he describes his workflow: http://www.kroah.com/log/blog/2019/08/14/patch-workflow-with-mutt-2019/ . The key parts are these: “The stable kernel tree, while under development, is kept as a series of patches that need to be applied to the previous release. This series of patches is maintained by using a tool called (quilt)… Anyway, the stable patches are kept in a quilt series in a repository that is kept under version control in git (complex, yeah, sorry.) That queue can always be found (here)”. The same applies to a lot of Debian packages. For example, gcc (and lots of other Debian packages) is, again, maintained as patches-stored-in-git. See here: https://salsa.debian.org/toolchain-team/gcc/-/tree/gcc-14-debian/debian/patches . I think this is, again, because of git merge and git rebase problems. So, spread your xit as a tool to solve all these problems. Of course, it helps if you are CLI-compatible with git
      • “If the first byte is 0, it is uncompressed; if it is 1, it is zlib-compressed”. I suggest moving to zstd, it is better in every way (faster and smaller); a sketch of the same flag-byte scheme with zstd appears after this list. Also, zstd may be good at compressing binary files (at least I hope zstd doesn’t make them significantly larger). “While xit has compression support, it currently disables it even for text files”. Try zstd -0, it is fast enough while giving substantial compression for text files. If it is too slow, try lz4, it is even faster
      • “Want to find the descendent(s) of a commit? Uhhh…well, you can’t”. As pointed out on lobsters, you can see descendants: https://lobste.rs/s/mltpfg/xit_is_coming#c_cnwsps . (But I understand your point, i.e. you argue that we need a separate data structure for this)
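
      A minimal sketch (in Rust, since that is what my program was written in) of the non-CDC approach from the third bullet: fixed-size chunks deduplicated by content hash. This is illustrative toy code, not my actual program:

      ```rust
      // Split input at fixed 4 KiB boundaries (no content-defined cut
      // points) and store each distinct chunk once, keyed by its hash.
      use std::collections::hash_map::DefaultHasher;
      use std::collections::HashMap;
      use std::hash::{Hash, Hasher};

      const CHUNK: usize = 4096;

      fn dedup(data: &[u8]) -> (Vec<u64>, HashMap<u64, Vec<u8>>) {
          let mut store: HashMap<u64, Vec<u8>> = HashMap::new(); // hash -> chunk bytes
          let mut refs = Vec::new(); // the input as a sequence of chunk hashes
          for chunk in data.chunks(CHUNK) {
              // A real tool would use a collision-resistant hash (e.g. BLAKE3);
              // DefaultHasher keeps the sketch dependency-free.
              let mut h = DefaultHasher::new();
              chunk.hash(&mut h);
              let id = h.finish();
              store.entry(id).or_insert_with(|| chunk.to_vec());
              refs.push(id);
          }
          (refs, store)
      }

      fn main() {
          let block = vec![7u8; CHUNK];
          let data = [block.clone(), block, vec![1, 2, 3]].concat();
          let (refs, store) = dedup(&data);
          println!("{} chunks, {} unique", refs.len(), store.len()); // 3 chunks, 2 unique
      }
      ```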
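
      And a sketch of the flag-byte scheme from the quote above, with zstd in place of zlib. This assumes the zstd crate; the 0/1 flag convention is taken from the quoted design, everything else is made up for illustration:

      ```rust
      // One flag byte in front of the payload:
      //   0 = stored uncompressed, 1 = zstd-compressed.
      fn encode(data: &[u8], level: i32) -> std::io::Result<Vec<u8>> {
          let compressed = zstd::encode_all(data, level)?;
          // Keep whichever representation is smaller, so incompressible
          // (e.g. binary) data never grows by more than the flag byte.
          let (flag, body) = if compressed.len() < data.len() {
              (1u8, compressed.as_slice())
          } else {
              (0u8, data)
          };
          let mut out = Vec::with_capacity(1 + body.len());
          out.push(flag);
          out.extend_from_slice(body);
          Ok(out)
      }

      fn decode(blob: &[u8]) -> std::io::Result<Vec<u8>> {
          match blob.split_first() {
              Some((&0, rest)) => Ok(rest.to_vec()),
              Some((&1, rest)) => zstd::decode_all(rest),
              _ => Err(std::io::Error::new(
                  std::io::ErrorKind::InvalidData,
                  "bad flag byte",
              )),
          }
      }
      ```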

      Feel free to ask any questions.

      Also: even if you implement all of this, I still do not plan to use xit. (I’m not trying to insult you, I am just trying to be honest here about my motivations.)

      Also, there is a discussion of your project here: https://lobste.rs/s/mltpfg/xit_is_coming . If you want, I can give you an invite

      1. 1

        From https://github.com/radarroark/xit/blob/master/docs/db.md :

        But for me, the bigger problem is that git’s data structures are just not that great. The core data structure it maintains is the tree of commits starting at a given ref. In simple cases it is essentially a linked list, and much like the linked lists you may have used, it can’t efficiently look up an item by index. Want to view the first commit? Keep following the parent commits until you find one with no parent. Want to find the descendent(s) of a commit? Uhhh…well, you can’t.

        Ha! This is a very good selling point for xit!

        1. 2

          Uhhh… well, you use the commit-graph file

          (one of the results of Derrick Stolee’s performance improvement work)

        2. 1

          Please, add “ai” tag :)

          1. 40

            Why? Now this story will be hidden for everyone who filters on the ai tag… but the story isn’t primarily about AI (it just mentions bots at one point).

            I enjoyed reading this but I would have missed it if it had been tagged ai when it was submitted.

          2. 2

            In my mind, a more advanced memory-safe system will likely overtake Rust and become adequate for the task of implementing codecs and other low level tools (perhaps integrated with a theorem prover and a proof assistant that lets the programmer clarify the assertions, preconditions, postconditions and invariants)

            Wuffs is exactly that! It is designed for compressors! ( https://github.com/google/wuffs )

            1. -10

              So it is okay to write such long annotations on one’s own postings? Okay, I will do so next time :)

              1. 18

                I think it’s not about yours/others’, but rather the fact that this is a link to raw source code, not to a blog post about the source code.

                That is, you shouldn’t contextualize something which already explains itself, but, if you are linking to a primary artifact, some explanation is in order!

                1. 7

                  I think this is mistagged, it should be in the “show” category, where it is common to actually write a contextualising comment.

                2. 2

                  @surprisetalk, I hope this comment will be very useful to you.

                  I want a format which is typed (similarly to protobuf) but embeds its type in the header.

                  Also (assuming it exists) I want to write an implementation, so I have a requirement: the format should be well-designed and should have a clear spec.

                  I spent a lot of time searching for such a format and found exactly one popular one: Avro (more precisely: the Avro object container file). Unfortunately, its spec is awful: https://www.mail-archive.com/user@avro.apache.org/msg04592.html (this is a list of 20-30 spec problems written by me). (These problems matter because I wanted to write my own Avro library.) (It is possible that the situation has improved since I wrote that mail.)

                  So, it seems the format I want simply doesn’t exist.

                  So, I plan to create one in the future. (This may be in the very distant future, say, 10 years. Or simply never.)

                  Of course, feel free to steal this idea.

                  Also: data-together-with-its-own-type is essentially an instance of a sigma type, i.e. an existential type. (That is what such a type is called in PLT [programming language theory].) So, an Avro object container file is an instance of one particular sigma type (which is always the same for all Avro files). In other words, it is possible to create an Avro library in some dependently typed language (Rocq, Lean, Idris, Agda), and that library will be very natural, as opposed to libraries in “normal” languages such as C++ or Rust or JavaScript. You will be able to represent every Avro file as an instance of a single sigma type. And you will be able to manipulate Avro values very naturally. (In a way impossible in all other languages, both typed and untyped.)
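
                  To make the sigma-type point concrete, here is a minimal sketch in Lean 4 (hypothetical names, nothing Avro-specific): a schema paired with a value of the type that schema denotes is literally a sigma type:

                  ```lean
                  -- A tiny schema language and its denotation into Lean types.
                  inductive Schema where
                    | int
                    | str
                    | list (elem : Schema)

                  def Schema.denote : Schema → Type
                    | .int    => Int
                    | .str    => String
                    | .list e => List e.denote

                  -- “Data together with its own type” is a sigma type: the first
                  -- component is the schema, the second is a value of the denoted type.
                  def SelfDescribing : Type := (s : Schema) × s.denote

                  -- A value that carries its own type, as one ordinary typed term.
                  def example1 : SelfDescribing := ⟨.list .int, [1, 2, 3]⟩
                  ```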

                  Unfortunately, the Avro community seems to be totally unaware of this correspondence. It seems the Avro community and the PLT community are absolutely disconnected. (And, of course, there is no Avro library for Rocq/Lean/Idris/Agda.)

                  Avro seems like enterprisey stuff, and Avro people seem not to care about all this PLT. It is possible I’m the first who noticed this link between Avro and dependently typed languages.

                  Anyway, I think a new format for typed-data-with-its-own-type should be created. I will hopefully create such a format in 10 years. (Well, I have a prototype written in C++.)

                  Ask me any questions

                  1. 5

                    I decided to use ASN.1/DER in a recent project because I was feeling old school lol

                    1. 3

                      ASN.1 has a gigantic specification. It looks like it was written in a totally different era. (I would say the “pre-Github” era.) It looks like it was written in an era when programming had nothing in common with fun. It looks like it was written by managers, not by programmers, and especially not by programmers who love their work. This spec scared me away completely, and I will never consider ASN.1 for my projects

                      1. 2

                        And proprietary ASN.1 compilers are another argument against ASN.1

                        1. 2

                          Yeah I mean it certainly was written in a different era (the 80s, I believe). I quite like it though. And DER (one of the main encodings for ASN.1 messages) is actually very simple FWIW. But sure I mean I doubt anyone’s gonna come along and pressure you to use it in your projects (though you may already be using it without realising if you’re using GSM, PKCS, Kerberos, etc). I for one find it quite satisfying to use the same serialisation/deserialisation paradigm being utilised in other parts of the stack but that’s just me haha
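
                          To make “very simple” concrete: a toy sketch (my own illustration, not from any library) of DER’s tag-length-value framing for a small non-negative INTEGER:

                          ```rust
                          // DER INTEGER: tag byte 0x02, short-form length byte, then the
                          // value as minimal big-endian two's-complement content octets.
                          fn der_integer(mut n: u64) -> Vec<u8> {
                              let mut content = Vec::new();
                              if n == 0 {
                                  content.push(0);
                              } else {
                                  while n > 0 {
                                      content.push((n & 0xff) as u8);
                                      n >>= 8;
                                  }
                                  content.reverse();
                              }
                              // DER rule: prepend 0x00 if the top bit is set, so the value
                              // is not misread as a negative two's-complement number.
                              if content[0] & 0x80 != 0 {
                                  content.insert(0, 0);
                              }
                              let mut out = vec![0x02]; // tag: INTEGER
                              out.push(content.len() as u8); // short form, fine below 128 bytes
                              out.extend_from_slice(&content);
                              out
                          }

                          fn main() {
                              assert_eq!(der_integer(127), [0x02, 0x01, 0x7f]);
                              assert_eq!(der_integer(128), [0x02, 0x02, 0x00, 0x80]);
                          }
                          ```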

                          1. 1

                            It was written by telecoms companies in the early 80’s, before TCP/IP created the internet and made most other networking technologies obsolete. Not just the pre-Github era, but the pre-PC and pre-internet era. Lots of stuff from the 50’s and 60’s looks like that; to some extent it wasn’t until the late 60’s and 70’s that “programmers who love their work” became a powerful technical force outside of universities. You aren’t allowed to have fun if you work for a government or giant company on machinery that costs more than your entire life does.

                          2. 3

                            It’s been a long time since I’ve looked at ASN.1 outside of X.509 but you inspired me to go look. I’m pretty surprised to see that there really isn’t much that Protobufs do that ASN.1/DER does not. Varints being an easy example of something that Protobufs do but… wow are they similar!

                            1. 5

                              Yeah they really do cover a lot of the same ground! But then with ASN.1 it’s so much more standardised, I feel like it’s massively underrated. I also find ASN.1’s ability to be encoded in a bunch of different ways depending on what’s appropriate for the situation-at-hand super cool, and it’s something that Protobuf doesn’t really do :)

                            2. 2

                              Did you use an ASN.1 compiler? Was it a commercial one you already had access to, or is there a FOSS one you like?

                              1. 2

                                Yeah I’m using Erlang/OTP’s built-in ASN.1 compiler, which is very nice. On the client side (a Vala/GTK app) I’m currently just deserialising directly from the DER right now, but I’ll probably switch to using a proper ASN.1 compiler at some point (or maybe I’ll make my own limited one for Vala, who doesn’t love a good yak shave). I’ve used asn1c and asn1scc before too, both have worked well for me.

                            3. 7

                              try this Alta Vista search instead

                              Wow, this was written before Google was a thing

                              1. 21

                                I was always amazed by how over-engineered Android’s calculator app seems to be (it’s been this way for as long as I can remember, too, at the very least since Android 5). I love that you can just scroll horizontally across a real number and get more and more digits out. Great to know how that works!

                                1. 1

                                  I have Android 7, and in my version of the calculator you cannot scroll. Moreover, 10^100+1-10^100 gives 0
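
                                  That looks like plain double-precision float arithmetic rather than the constructive reals described in the article; a quick Rust sketch shows the same cancellation:

                                  ```rust
                                  // f64 carries only ~15-16 significant decimal digits, so the +1
                                  // is absorbed by rounding before the subtraction even happens.
                                  fn main() {
                                      let x = 1e100_f64;
                                      println!("{}", x + 1.0 - x); // prints 0
                                  }
                                  ```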

                                  1. 1

                                    Weird! I vividly remember noticing this after flashing LineageOS (or maybe it was still CyanogenMod?) on my Samsung phone ~2014 (the phone was originally running Android 4.1 and Samsung’s god-awful TouchWiz UI, that much I am sure of, I don’t think I’ve ever hated a UI nearly as much as I hated TouchWiz).

                                    1. 1

                                      App info for the calculator says: “App downloaded from Google Play Store”. Version 6.0.63.9. My phone is a Samsung Galaxy S6 SM-G920F. I bought it a long time ago, in 2014 or something like that

                                2. 45

                                  And the Linux kernel maintainers are just cool with this, huh? A disagreement over programming languages where the old guard refuses to give a single millimeter while folks are turning out high-quality, production code, leads to middle-school-caliber cliques, personal abuses, and predictably, maintainer burnout and resignations.

                                  The Linux community is so much less with Marcan’s resignation, but it’s also so much less if nobody will rein in the LKML folks and maintainers, because this abuse will continue until Linux is an OS only for enterprise servers or things that can broadly look like them insofar as hardware support goes (see, Marcan’s notes about the death of the hobbyist hacker in the kernel and core low-level userspace). Linus - where you at, man?

                                  1. 15

                                    I predict that if Linus steps in to issue a BDFL judgement, it will favor the LKML old guard.

                                  Linux is already owned, for all practical purposes, by the LF member organizations, who are all big companies. Their common interest doesn’t favor technical innovation nearly so much as it does stability for existing business-critical applications. So, yeah, enterprise servers, Android, and some of the embedded market.

                                    1. 28

                                      That’s completely wrong.

                                    Red Hat is funding the development of Nova, and graphics drivers are pretty much the first real-world test for Rust in Linux. Microsoft employs several maintainers of the R4L core. Google pretty much was the first to look at Rust for a Binder replacement. All of them are also working on other projects adjacent to OSes in Rust (for example OpenVMM, Fuchsia, COCONUT-SVSM which was started by SUSE). So there’s a lot of desire for innovating with Rust among those companies.

                                      1. 13

                                        I would argue that the Asahi GPU drivers were the first test, but that’s basically quibbling, they’re the first two for sure.

                                        1. 4

                                          Yes, Asahi GPU drivers are part of the “graphics drivers” category. :) Nova is the second within the category of graphics drivers.

                                          Graphics drivers are also obviously dependent on the DMA functionality, meaning that Red Hat has an incentive to find a solution to the current issue and it will help Asahi as well.

                                          BTW the phone made a mess of CoconutSVSM.

                                        2. 2

                                          My point wasn’t that LF members don’t want to use Rust in general (which is clearly false!) but that they may not have much shared incentive to advocate for Rust in the kernel. The R4L statements of support come from a much smaller set of orgs than the set of LF members. Funding or labor contributions aren’t as easy to find, but I’d be willing to bet we’re talking at least 2 orders of magnitude difference, and probably 3.

                                          1. 6

                                            Though note that some of those companies supporting R4L are also some of the biggest contributors to Linux. The Linux Foundation itself doesn’t do much Linux development, and most LF members don’t do tons of Linux development. So a raw count doesn’t tell you much.

                                        3. 16

                                          That would have been more true a couple of years ago, though. At the moment, those are the same groups coming under pressure to visibly start improving memory safety, and “we’re adopting Rust” is an easy signal to use.

                                          1. 5

                                            At some point it’s easier to get memory safety features in standard C than to rewrite the Kernel significantly in Rust.

                                        C23 didn’t standardize “attribute cleanup”, but it’s not far-fetched. It has memset_explicit, which is a security-only feature.

                                        As much as it is an important topic, memory safety is clearly not a high priority for the big players, otherwise C wouldn’t have been the most popular language in OS development for the past 50 years.

                                            Rust makes a big deal out of it (and it is) but maintainers and companies are willing to trade all of it off for plain stability and compromise.

                                            1. 28

                                              At some point it’s easier to get memory safety features in standard C than to rewrite the Kernel significantly in Rust.

                                              Not really. The same problem exists here as with Rust - working with upstream. Grsecurity had open source patches for decades with mitigations that reduced vuln impact radically, across entire classes of attacks. LF could have paid Brad to upstream but they didn’t for political reasons, and when there have been more politically minded folks like Kees who have tried it’s taken decades just to cherry pick basics, and that’s been met with massive criticism and public insults from upstream as well.

                                              The problem isn’t Rust, the problem isn’t C, the problem is and always will be upstream.

                                              otherwise C wouldn’t be the most popular language for the past 50 years in OS development.

                                              Massive survivorship bias etc. All major kernels have been in C, thus those interested in kernel dev tend to learn C, thus new kernels are written in C.

                                              1. 20

                                                At some point it’s easier to get memory safety features in standard C than to rewrite the Kernel significantly in Rust.

                                                So far, no one has managed to invent a good memory safety scheme that doesn’t require a lot of invasive changes to C or C++ code to work. Sean Baxter’s Safe C++ just added Rust’s borrow checking feature to C++, which ofc existing code structures don’t fit. I’m not aware of any proposal that would actually solve memory safety in C, just vague hand waving that it’s totally possible without too much effort.

                                                1. 16

                                                  At some point it’s easier to get memory safety features in standard C than to rewrite the Kernel significantly in Rust.

                                                  I’ve thought a lot about this. I hate how complicated Rust is and I would love there to be a simpler way to get what Rust offers. But all the proposals to make C more memory-safe are laughably weak in comparison, and when I think about what it would take to draw the rest of the owl, I end up with something pretty close to Rust.

                                                  In particular, we are not yet (and perhaps won’t ever be) able to prove the safety of real-world C (or any other sufficiently versatile memory-unsafe language) in general. So if we are to have any hope of solving this problem we need a way to encapsulate things whose safety can’t be automatically proved. I think it’s fair to say this implies generic programming capabilities on par with Rust’s.

                                                  Also, while bounds checking is somewhat feasible to retrofit to C if you’re willing to make it slower and break all ABI compatibility everywhere, memory management is not. There’s a reason nobody uses attribute cleanup: it’s just not very useful. Heck. C++ has just about every memory management aid short of a borrow checker, and none of that seems to be enough. Rust makes this problem tractable by restricting what you can do, which feels bad to me coming from C, but we can’t statically solve the general problem today, and the runtime option is basically garbage collection. Practically speaking you cannot apply Rust’s solution to C either because you’d have to rewrite enough C to totally defeat the point.

                                                  Maybe we’ll get garbage-collected, bounds-checked C (I guess that’s what CHERIoT is) before we get a critical mass of software written in Rust. I honestly have no idea how to call it. But I don’t think there is any change you could reasonably make to C to create a path of less resistance than “rewrite it in rust” or “magical memory-safe hardware”, and the not-memory-safe hardware that’s in use today will probably go on being used for some time.

                                                  Rust makes a big deal out of it (and it is) but maintainers and companies are willing to trade all of it off for plain stability and compromise.

                                                  I think you could say the same thing about safety (the other kind) improvements in many fields. People who make and sell products have, historically, needed to be threatened with quite big sticks to spend money on making the products not hurt people. At some point I think some stick-waving will take place in software too.

                                                  1. 6

                                                    Heck. C++ has just about every memory management aid short of a borrow checker, and none of that seems to be enough

                                                I’ve had the “privilege” of building and maintaining a now 6-year-old C++ application. Originally built as a prototype in early 2020, we chose C++17 because it was the newest standard supported by the embedded Linux distribution provided by our vendor. The application is very data-heavy and allocation-heavy (soft real-time processing of 25MP images at 20fps). It has gone through a ton of evolution and has had modules written by… politely… people who aren’t C++ memory management experts.

                                                    While the number isn’t 0, we have had very very few bugs that were due to memory safety issues. The keyword new is not used explicitly anywhere in the codebase; std::shared_ptr and std::unique_ptr are used heavily. Arrays and vectors are very rarely accessed directly by index but rather using for (auto elem: vec) {...} or other iterator techniques. We don’t, currently, have any static analysis beyond compilation in our CI process; at one point we did use SonarQube and it did a good job of pointing out some questionable stuff in some of the early code.

                                                    You’re right, compared to Rust, our code has not been perfect from a memory-safety perspective. At the moment, through its current execution paths, it appears to be, at least according to valgrind. We’ve soak tested it for weeks on spare hardware and don’t seem to have any memory leaks. We’re doing thousands of allocations/sec with small objects, and enough large allocations every second to consume all 64GB of system RAM in about 2 minutes.

                                                    Might there be issues lurking somewhere? Maybe. From an engineering perspective though? This system works fantastic and if there are issues related to memory corruption they are incredibly rare in practice.

                                                  2. 4

                                                It’s worth observing (as more follow-up/color than any kind of “rebuttal”, since I don’t really disagree!) that broader standards like C23 are less relevant for projects like the Linux kernel. There was some early activity in the 2009 to 2013 era, based more on clang analyzer & warnings, but it probably wasn’t until late 2013 (about 11.5 years ago) that they even regularly tested with clang. From 1990 until then (aka a quarter of a century), the Linux kernel was basically gcc-specific C. Any compiler that wanted to work with the Linux kernel was/is beholden to plenty of gcc extensions like its inline asm syntax. So, the reality today is probably “compatibility with whatever extensions both clang & gcc support”, not any specific standard. (EDIT: and I make no claim as to “how close” to “good enough” various notions of “safety” one could realize this way. It’s more a point about C23 relevance vs. compilers willing to experiment more than a standards committee might be. A related observation might be that even kernel-specific C extensions for new static analysis like sparse have been considered fair game.)

                                                  3. 1

                                                    I don’t doubt it, but could you elaborate about the pressure? Who’s exerting it, what leverage do they have? Maybe provide some examples?

                                                    1. 11

                                                  As an example of pressure from one direction, the NSA report on memory safety from 2022 led eventually to a National Cybersecurity Strategy with a pillar about “shifting software liability” and USG investment in “memory-safe languages”. And this was repeated by all the Five Eyes governments. That sort of thing implies it’ll be on the agendas of regulators, procurement officers, investors, and insurance companies, which puts it on the agenda for CIOs and CTOs.

                                                      We also have people like Mark Russinovich (Microsoft Azure CTO) saying C/C++ should not be used in new projects, and Microsoft does quite a bit with Linux nowadays.

                                                      The EU introduced product liability rules recently that may be a big redirection of priorities for embedded and appliance software folks.

                                                      1. 6

                                                        Several governments, chiefly the US federal government. Also, while it doesn’t directly mention memory safety (I believe), the EU Cyber Resilience Act does mandate stricter cybersecurity standards and puts heavier liabilities on companies for faults in products with software elements. The oft-quoted figure that 70% of vulnerabilities are caused by memory safety issues is presumably not universal, but if it is true for Google and Microsoft products, then it seems like they have a lot of incentive now to make sure their products are memory safe.

                                                        1. 15

                                                          The oft-quoted figure that 70% of vulnerabilities are caused by memory safety issues is presumably not universal, but if it is true for Google and Microsoft products, then it seems like they have a lot of incentive now to make sure their products are memory safe.

                                                          Alex Gaynor’s What science can tell us about C and C++’s security gave the following numbers back in May 2020:

                                                          • Android (cite): “Our data shows that issues like use-after-free, double-free, and heap buffer overflows generally constitute more than 65% of High & Critical security bugs in Chrome and Android.”
                                                          • Android’s bluetooth and media components (cite): “Use-after-free (UAF), integer overflows, and out of bounds (OOB) reads/writes comprise 90% of vulnerabilities with OOB being the most common.”
                                                          • iOS and macOS (cite): “Across the entirety of iOS 12 Apple has fixed 261 CVEs, 173 of which were memory unsafety. That’s 66.3% of all vulnerabilities.” and “Across the entirety of Mojave Apple has fixed 298 CVEs, 213 of which were memory unsafety. That’s 71.5% of all vulnerabilities.”
                                                          • Chrome (cite): “The Chromium project finds that around 70% of our serious security bugs are memory safety problems.”
                                                          • Microsoft (cite): “~70% of the vulnerabilities Microsoft assigns a CVE each year continue to be memory safety issues”
                                                          • Firefox’s CSS subsystem (cite): “If we’d had a time machine and could have written this component in Rust from the start, 51 (73.9%) of these bugs would not have been possible.”
                                                          • Ubuntu’s Linux kernel (cite): “65% of CVEs behind the last six months of Ubuntu security updates to the Linux kernel have been memory unsafety.”

                                                          That’s basically every OS and browser engine with significant market share. (If anyone has cited numbers on the BSDs, please let me know. I don’t have time to go digging myself.)

                                                          1. 3

                                                        Many of those projects also use tools like valgrind regularly in their builds, from what I recall. It really looks like, past some point of codebase size, if it is C/C++ you are almost guaranteed to have memory safety vulnerabilities no matter what you do to prevent it.

                                                            1. 3

                                                              I prefer to describe it as “Regardless of how capable of writing correct C or C++ individuals are, it’s been shown that people in groups cannot”.

                                                              It’s more diplomatic, given how emotionally invested in their own skills people can get.

                                                          …and more recently, I’ve started describing Rust as a lockout-tagout system for software development, based on the ways its designers chose to focus on maintainability of projects written in it. (e.g. working to avoid the need to reason globally, providing a powerful type system for encoding invariants, etc.)

                                                  4. 6

                                                    Honestly, Linux won’t be the dominant “new” server in a few decades and it’s exaaaactly because of the way upstream is supported. Linux will be relegated to “weird” hardware that IBM pays a fortune to maintain and provide enterprise support for.

                                                    Linus - where you at, man?

                                                    He’s a huge part of the problem. Always has been.

                                                    1. 1

                                                      Marcan’s notes about the death of the hobbyist hacker in the kernel and core low-level userspace

                                                  Where is this? I cannot find it

                                                    2. 1

                                                      Also: GIF (and LZW) patents expired in 2004

                                                      1. 1

                                                    How is it possible to return a stack-allocated dynamic array from a function?!

                                                        1. 5

                                                      IIRC, they allocate a secondary stack and store the arrays there; it’s not “on the stack” in the sense you’re imagining.

                                                          1. 1

                                                            Yes, and it’s fascinating!

                                                            I found a great, short summary of how it works here: https://nytpu.com/gemlog/2024-12-27-2

                                                            1. 4

                                                          I haven’t read the link yet, but is this not just similar to a kind of arena allocator?

                                                          Edit: I should have read the article first.

                                                          The secondary stack is basically just an arena allocator with incremental deallocation. When an object goes lexically out of scope, the compiler will mark it as unused, and then the runtime will try to roll back the secondary stack as far as possible without clobbering live data (potentially rolling back other regions marked unused if they weren’t cleaned up before)
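
                                                          A minimal Rust sketch of that idea (hypothetical names, not GNAT’s actual runtime): a bump arena with mark/release, where the callee “returns” a dynamically sized value by bumping the arena top and the caller rolls it back when done:

                                                          ```rust
                                                          // The "secondary stack": a bump arena that grows as callees
                                                          // allocate and is truncated back to a saved mark by callers.
                                                          struct SecondaryStack {
                                                              buf: Vec<u8>,
                                                          }

                                                          struct Mark(usize);

                                                          impl SecondaryStack {
                                                              fn new(capacity: usize) -> Self {
                                                                  Self { buf: Vec::with_capacity(capacity) }
                                                              }
                                                              // The caller saves the current top before calling.
                                                              fn mark(&self) -> Mark {
                                                                  Mark(self.buf.len())
                                                              }
                                                              // The callee "returns" a dynamically sized array by bumping the
                                                              // top; the primary stack only holds the (offset, length) pair.
                                                              fn alloc(&mut self, bytes: &[u8]) -> (usize, usize) {
                                                                  let off = self.buf.len();
                                                                  self.buf.extend_from_slice(bytes);
                                                                  (off, bytes.len())
                                                              }
                                                              // Rolling back to the mark frees everything the callee
                                                              // allocated after it, in one step.
                                                              fn release(&mut self, m: Mark) {
                                                                  self.buf.truncate(m.0);
                                                              }
                                                          }

                                                          fn main() {
                                                              let mut ss = SecondaryStack::new(1024);
                                                              let m = ss.mark();
                                                              let (off, len) = ss.alloc(b"dynamically sized result");
                                                              println!("{}", String::from_utf8_lossy(&ss.buf[off..off + len]));
                                                              ss.release(m); // caller pops the callee's allocation
                                                          }
                                                          ```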

                                                        2. 3

                                                          @david_chisnall

                                                      p. 16 of the extended report ( https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-996.pdf )

                                                          when ECC bits are used to hold metadata

                                                          Wait, what?! CHERI uses ECC bits for its own needs and thus we get no ECC?! This is unacceptable.

                                                      I’m typing this on a laptop with ECC. Yes, I intentionally bought a laptop with ECC. I don’t want CHERI to disable ECC

                                                          1. 6

                                                        First: no, that is not essential; most implementations do not use ECC bits. Morello did, as an experimental option, but there are a number of performance-related downsides to this.

                                                            Second, the DDR5 spec permits a variable number of metadata bits. ECC schemes use some of these. If you wanted to use some of these for CHERI then there can still be enough left over for ECC.

                                                            1. 1

                                                          I’m amazed they make laptops with ECC. I’d love it if ECC memory were basically the default and it were hard not to get it, but sadly we don’t live in that world.

                                                          I did a web search and I found one vendor (Eurocom) that did it, in their laptop server line, and their bottom-spec laptop with ECC was a little under $10k USD.

                                                              If you happen to know of a different, less expensive option, I’d love to hear about it.

                                                              1. 2

                                                            I’ve got a Dell Precision 7780. I bought it for 5000 USD. But it is highly configurable, so you can put together a cheaper configuration. Also see:

                                                            • Other Dell Precision models
                                                            • HP ZBook Fury G10 (and other ZBook Fury models)
                                                            • Lenovo ThinkPad P16 (and other ThinkPads)

                                                            Also: my laptop seems to be the only laptop out there that has ECC and no Nvidia. This was important for me, because Nvidia usually works badly with Linux. Also, my laptop comes with Ubuntu preinstalled.

                                                                Also see: https://lobste.rs/s/scllqn/ecc_ram_on_amd_ryzen_7000_desktop_cpus

                                                                1. 2

                                                              Also: when you search for laptops on Dell’s, Lenovo’s, etc. sites, specify US as your location, even if you live somewhere else. You will be able to see more ECC laptops :)

                                                                  1. 1

                                                                    afaik Lenovo, Dell and HP have options in their workstation laptop ranges. Thanks to market segmentation you need an explicitly Xeon-branded CPU, and even then not all offer it.

                                                                    It’s a bit unclear to me what the situation is in AMD land: a lot of AMD desktop CPUs do support ECC in principle, but I don’t know if that translates to the mobile models and if anyone actually makes laptop mainboards that also support it.

                                                                EDIT: apparently the Xeon bit is outdated, the P16 spec sheet doesn’t use that branding and still claims ECC options: https://thinkstation-specs.com/thinkpad/p16-gen-2/

                                                                    1. 1

                                                                      Interesting!

                                                                      I agree, the P16 says it’s supported, which is awesome.

                                                                      I checked lenovo.com and they won’t let you buy a P16Gen2 with ECC. So perhaps it’s supported and you just buy it with the smallest memory option and then swap it out. Seems the HP website is the same way. I couldn’t find any that would ship with ECC.

                                                                  I tried the Dell website; without trying very hard, I couldn’t even figure out how to change the memory configuration on their laptops. None seemed to say they offered ECC as something you could get shipped.

                                                                  I remember hearing that AMD was going to offer ECC support across their entire CPU range, but, like you, I have never seen ECC RAM in the wild on any AMD machine that wasn’t specifically a server.

                                                                      1. 3

                                                                    I have never seen ECC RAM in the wild on any AMD machine that wasn’t specifically a server

                                                                    here we have ECC memory on desktop AMD (but this is not a laptop): https://lobste.rs/s/scllqn/ecc_ram_on_amd_ryzen_7000_desktop_cpus

                                                                  2. 2

                                                                Okay, I’m on p. 15 of the extended version now ( https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-996.pdf ).

                                                                    Conversely, a customer might be willing to pay a premium for strong memory safety but lack the means to communicate this preference effectively or verify that a vendor’s product meets their needs

                                                                Okay, after reading this sentence it seems I now understand what the authors are proposing. It seems the authors propose essentially something like a “This thing is compatible with Windows Vista” stamp.

                                                                I.e. some kind of logo or stamp attached to software products, e.g. “This product has level A7 of memory safety”. And consumers will be able to choose products based on this stamp.

                                                                Well, such a system is not as bad as it sounds. In fact (without any irony) I think this would be an improvement over the current state of things.

                                                                But to be actually useful, such a system should take into account the real level of memory safety. For example, Rust is currently not as memory-safe as it seems to be, because it is full of unsoundness bugs (they are labeled “I-unsound” in the bug tracker). And no, Ferrocene Rust is not a solution. Their changes to the compiler are minimal. I recently did a diff between upstream Rust and Ferrocene: the changes are minimal, mostly in tests and in support for particular hardware. No attempt was made to fix real unsoundness bugs.

                                                                On the other hand, SML ( https://en.wikipedia.org/wiki/Standard_ML ) is a small and very rigorously specified language. Due to its small size (and high-quality spec) it is possible to implement a very reliable and secure compiler for it.

                                                                So, I want that hypothetical stamp system to assign higher safety levels to programs written in SML than to programs written in Rust. If it does, then this will be absolutely amazing

                                                                (I’m on p. 15; I have not read past that page yet)

                                                                    1. 1

                                                                  p. 14 of the extended report ( https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-996.pdf ).

                                                                      …through mechanisms like after-market solutions, disaster recovery, and national security implications, with billions of dollars in damage

                                                                  I think the authors meant this:

                                                                      …through mechanisms like after-market solutions and disaster recovery with national security implications and billions of dollars in damage

                                                                  Again, a very stupid typo for such an ambitious paper.

                                                                  Am I supposed to read the “extended report” at all?

                                                                  Maybe that “extended report” is just a draft and we all should read the main version ( https://cacm.acm.org/opinion/it-is-time-to-standardize-principles-and-practices-for-software-memory-safety/ ) instead? Maybe the main version contains fewer typos? Then remove the link to that “extended version”. Or add a caution: “the extended version is buggy”

                                                                      1. 0

                                                                        @david_chisnall

                                                                    Okay, I found a real typo.

                                                                    p. 12 of the extended report ( https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-996.pdf ).

                                                                        Footnotes.

                                                                        Links for eBPF and wasm are swapped. (facepalm)

                                                                    Did anybody read the whole thing before publication?

                                                                    How did this paper get a DOI with such a stupid typo?

                                                                    Who is the intended audience for this paper? The US government? Why not read the whole paper one more time before giving it to the US government?

                                                                        1. 1

                                                                      p. 11 of the extended report ( https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-996.pdf ).

                                                                          The paper lists CHERI as one of “systems that deterministically detect violations of memory safety at run time”.

                                                                      The word “deterministically” seems to imply that CHERI detects every memory violation. But, as far as I understand, this is not true. Let’s assume that we compile some C program to CHERI, and that we have a char *a in that program which points to 100 chars. Then, as far as I understand, b = a[103] will not necessarily be detected by CHERI, because CHERI uses a special compressed pointer representation, which detects access violations in most cases, but not in all cases

                                                                          1. 2

                                                                        Then, as far as I understand, b = a[103] will not necessarily be detected by CHERI, because CHERI uses a special compressed pointer representation, which detects access violations in most cases, but not in all cases

                                                                        Even in the smallest systems, you have byte precision up to half a KiB; for the 64-bit systems it’s over a megabyte. That said, your example is not a memory-safety vulnerability unless there is another object immediately after the buffer. Current CHERI implementations do not have this problem; they will insert padding where necessary.

                                                                          2. 2

                                                                        p. 10 of the extended report ( https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-996.pdf ).

                                                                        The table at the end of the page is misleading. The first row pretends to be about “Fully memory-safe… languages”, yet it lists Rust as one of the examples. But Rust is currently full of unsoundness bugs (they are marked “I-unsound” in the bug tracker). Thus, right now, Rust doesn’t count as a “fully memory-safe” language.

                                                                        (I have not read past p. 10 yet.)

                                                                            1. 1

                                                                          Is it okay to report possible typos here?

                                                                          p. 10 of the extended report ( https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-996.pdf ) says: “Rust, Python, Swift, Java, C#, SPARK, and OCaml – excluding code in their unsafe TCBs (e.g., Unsafe Rust); memory-safe C++ subsets”.

                                                                          As far as I understand, TCB means “trusted computing base”. “TCB” seems inappropriate here; the authors probably meant “in their unsafe subsets” (or supersets, or flavors, or variants, etc.)

                                                                              1. 3

                                                                            I think it’s correct. If you want to trust the safety of a C program, you have to trust every line. If you want to trust the memory safety of a Rust program, you only have to trust the unsafe bits. So your memory-safety trusted computing base is only the unsafe Rust parts.

                                                                            I guess it’s not that the TCB is “unsafe Rust”, but that the TCB of any Rust application is the subset of it written in unsafe Rust.

                                                                            (and all the TCB stuff is only regarding memory safety)
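
                                                                            A small Rust sketch of that framing (my own illustration): the memory-safety TCB of this function is the single unsafe line, because the safe code around it establishes the precondition:

                                                                            ```rust
                                                                            // Auditors only need to check the unsafe block (the TCB), not the
                                                                            // whole program: the emptiness check above it is the justification.
                                                                            fn first(v: &[i32]) -> Option<i32> {
                                                                                if v.is_empty() {
                                                                                    None
                                                                                } else {
                                                                                    // SAFETY: v is non-empty, so index 0 is in bounds.
                                                                                    Some(unsafe { *v.get_unchecked(0) })
                                                                                }
                                                                            }

                                                                            fn main() {
                                                                                assert_eq!(first(&[10, 20, 30]), Some(10));
                                                                                assert_eq!(first(&[]), None);
                                                                            }
                                                                            ```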