Threads for JoachimSchipper

  1. 1

    This is great! My one suggestion would be to host the Python reference code on a site where people can read it in their browser, instead of sending a tarball… especially because it’s so short, concise, and pretty.

    1. 2

      That’s actually a neat suggestion. I’ll see about distributing it online in addition to the tarball. I’ll also need to find a way to remove that pesky licence code from the HTML version… oh, and syntax highlighting…

      Do you know of a command line program that can turn Python code into pretty HTML?

      1. 4

        vim can export syntax highlighted code to styled HTML, built right in:

        see :h TOhtml or this useful SO post

        1. 3

          Pygments is the standard solution, I think.

      1. 3

        I’m generally an advocate of C++, but rather than “cute”, I found this implementation (casting memcpy to a function pointer of a different type, which “works” based on knowledge of the underlying ABI) kind of horrific. Surely this is the sort of thing that should be done with a compiler builtin, rather than relying on specific semantics for code that definitely invokes undefined behaviour.

        1. 2

          I agree that this code is very nonportable, but this is a compiler / standard library fork. (As a proof-of-concept for a proposal to change the standard.)

        1. 2

          Although it’s good to see an attempt at security against “evil maid” attacks (via the TPM2) for Linux, the proposed approach seems rather unprincipled:

          • the “payload” for the OS - the given example is an e-mail client - does not appear to be integrity-protected (or did I misunderstand that part?). My e-mail client holds some of my most important data (including the ability to reset passwords for many online services)!
          • the root filesystem seems to be designed to be integrity-protected by a TPM2-bound (i.e. hardware-bound) key, but rollbacks are not prevented (so a user could be un-locked, or a service could be un-disabled)
          • the root filesystem is actually encrypted, and “integrity-protected” by btrfs checksums (inside the encrypted container); that’s… not a standard cryptographic construction.

          (Also, this is very much not a traditional Unix system, and I like traditional Unix systems.)

          1. 14

            This reads like a puff piece. It’s an interesting project but I wouldn’t say there was a real takeaway except that you have YC funding now.

            1. 11

              Ouch; this is very unconstructive criticism.

              1. 4

                I liked the article as an experience report - you can build something Erlang-ish in Rust on wasm and end up at least convincing yourself (and YC?) that it works. I agree that the article doesn’t have a strong central thesis, but I found it interesting.

              2. 11

                Sadly I believe you’re correct, especially given the post history here.

                For folks that quibble with this dismissal as a “puff piece”: for me at least, if this post had any code at all showing how the APIs changed, how this mirrored GenServers or other BEAM idioms, how various approaches like the mentioned channels approach changed the shape of the code, or anything like that, I wouldn’t be so dismissive. Alas, it seems like a growth-hacking attempt with lots of buzzwords (I mean christ, look at the tags here).

                Marketing spam and bad actors still exist folks.

                1. 2

                  Hi friendlysock, I do mention in the post “Check out the release notes for code examples”. Here is a direct link to them: https://github.com/lunatic-solutions/rust-lib/releases/tag/v0.9.0

                  1. 6

                    From (successful) personal experience: you can get away with promoting your stuff if you offer people something of real value in exchange for taking their time & attention. Nobody cares what’s in your GitHub: make the content on the page you’re posting worth reading in its own right.

                    1. 5

                      Friend, your only contributions to this site have been entirely self-promotion for your Lunatic project. It’s a neat project, but you are breaking decorum and exhibiting poor manners by using us in a fashion indistinguishable from a growth hacker. Please stop.

                      1. 1

                        I don’t think it’s fair to call a blog that has 3 posts in 2 years “marketing spam”. This submission is currently #1, so it’s obviously of interest to the community. But with this backlash in the comments I’m definitely going to refrain from posting in the future.

                        1. 19

                          I don’t think it’s fair to call a blog that has 3 posts in 2 years “marketing spam”.

                          In one year, as I write this comment, you have:

                          • Submitted 3 stories, all self promotion.
                          • Made 5 comments, all on stories that you submitted, promoting your own project.

                          That is not engaging with this community, that is using the community for self promotion, which is actively contrary to the community norms, and has been the reason for a ban from the site in the past.

                          This submission is currently #1, so it’s obviously of interest to the community.

                          The rankings are based on the number of votes, comments, and clicks. At the moment, all of the comments on this article are either by you, or are complaining about the submission. This will elevate the post, but not in a good way.

                          But with this backlash in the comments I’m definitely going to refrain from posting in the future.

                          I would say that you have two choices:

                          1. Stop posting altogether.
                          2. Engage with the community, comment on other stories, submit things that are not just your own work.

                          The general rule of thumb that I’ve seen advocated here is that posts of your own things should make up no more than 10% of your total contributions to the site. At the moment, for you, they are 100%. If they were under 50%, you’d probably see a lot fewer claims that you were abusing lobste.rs for self promotion.

                          1. 4

                            I don’t know how to resolve the problem that this is an interesting project but only being posted by you, and that there’s a business wrapped around it, where you’re the ‘CEO’ - which just makes it a bit awkward when people are interested in the tech but opposed to ‘spam’.

                            I’m certainly interested in following the project, so I’d prefer that you keep posting!

                  1. 2

                    You may be wondering why this is just coming to light now, when Java has had ECDSA support for a long time. Has it always been vulnerable?

                    No. This is a relatively recent bug introduced by a rewrite of the EC code from native C++ code to Java, which happened in the Java 15 release. Although I’m sure that this rewrite has benefits in terms of memory safety and maintainability, it appears that experienced cryptographic engineers have not been involved in the implementation.

                    Couldn’t an end-to-end fuzz test on the original and new code catch this? Not sure.

                    I think it’s clear that security-sensitive code should be evaluated by experts. But the recent trend to rewrite other core infrastructure in Java / Go / Rust gives me pause.

                    Fuzz both sides, get an expert, or just let the original code be?

                    1. 3

                      This is a very easy bug to find - just trying all-zeroes will work, and most testing strategies should test all-zeroes.

                      In general, though, crypto code can break only on very particular inputs (e.g. carry-chain bugs). You want expert review, and/or a careful code comparison against the original code (which would have worked!), and probably something like Project Wycheproof, which collects a number of test vectors (etc.) for specific algorithms.
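
                      To make the all-zeroes point concrete, here’s a minimal sketch (in C, with toy fixed-width integers standing in for the bignums a real implementation needs) of the range check whose absence lets an all-zeroes signature verify; the function name is illustrative:

                      #include <stdint.h>

                      /* ECDSA requires r and s to lie in [1, n-1]; the all-zeroes
                         "signature" (r == 0, s == 0) must be rejected here, before
                         any curve arithmetic happens. */
                      static int ecdsa_sig_in_range(uint64_t r, uint64_t s, uint64_t n)
                      {
                          return r > 0 && r < n && s > 0 && s < n;
                      }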

                    1. 1

                      Probably not what you’re looking for, but Windows seems to be getting more and more VM-based isolation, e.g.

                      Microsoft’s solutions tend to be a lot more transparent than Qubes, which is both good (security you don’t use doesn’t help) and bad (Qubes will make you think more seriously about crossing security domains, and e.g. disposable VMs can help recover from undetected compromise).

                      1. 2

                        Great, but overall more and more anti-customer “features” are being forced onto users, often without warning and with no option to refuse.

                        With that, it’s hard to consider the OS even remotely secure.

                      1. 5

                        This is a really nice book. I still find myself reaching for OO in Python when I want to organize my code, and it is great to have the design patterns so that I can avoid reinventing the wheel every time.

                        By the way, the author’s blog is also really great.

                        1. 5

                          I’m a big fan of the author’s essay on semantic linefeeds. That essay completely changed the way I write LaTeX. (E.g., https://git.sr.ht/~telemachus/socratic-notes/tree/main/item/crito.tex.)

                          In fact, I think I’ll post that again now…

                          1. 2

                            Since we’re talking TeX tricks: what did you achieve by doubling up the section headers with [[-? I see you have e.g.

                            % [[- Socrates initial response (46b1--c6)
                            \subsection*{Socrates initial response (46b1--c6)}
                            

                            … but e.g. vim-latex supports folding sections directly. Of course, there’s lots of editors that are not vim, and many things one could want which are not folding…

                            (Always happy to learn a new trick!)

                            1. 3

                              Sorry: no new trick. I do use vim, and those are (custom) fold markers. I don’t use vim-latex, even though I appreciate a lot of what’s in it.

                              But vim-latex does a lot more than I want. Rather than turning off large parts of it and tweaking others, I borrowed or adapted the pieces I liked. I did a similar thing with vim-go and made my own vim-go-mini. Some of the larger vim plugins move too far (for me!) in the direction of treating vim like an IDE. It’s silly, but I like things just so. Also, I’ve learned a lot about vim by writing some of my own plugins.

                              1. 2

                                I see. Thanks for your response.

                                … I must admit that I just use standard LaTeX syntax highlighting in vim, and nothing else. But that’s more due to using a bunch of different environments than due to any particular opinion about vim-latex in particular. I do share your feeling that trying to make vim into an IDE doesn’t necessarily result in a better editing experience.

                                1. 2

                                  Do you have your adapted vim latex files somewhere? I dislike vim-latex for exactly the same reason. It does quite a lot more than what I want, and in many cases, does things I don’t want.

                                  1. 1

                                    Annoyingly, I haven’t put everything together neatly like I did with vim-go-mini. I should, and (hopefully?) this will push me to do that.

                                    In the meantime, there’s a bunch of LaTeX-related stuff in my vim-dotfiles.

                                    In particular, maybe look at these?

                                    (I have a LaTeX snippets file and a bib snippets file, but those are probably too trivial to be useful to anyone else.)

                                    One other thing that may be helpful: I put together a little plugin to get autocompletion from local .bib files.

                              2. 2

                                Loved the “venerable TROFF” reference in that essay. Brings back fond memories of the typesetting struggles pre-LaTeX.

                            1. 4

                              Really great approach! The whole post showcases a quite pragmatic approach to the problem they had and how they optimized for it. My favorite sentence is at the end:

                              The library works for our target platforms, and we don’t wish to take on extra complexity that is of no benefit to us.

                              A truly pragmatic approach: it works for us; if you want, fork and modify to oblivion, we give you a starting point. I wish more software were handled like this instead of including everything but the kitchen sink.

                              1. 0

                                More and more people are fortunately adopting the suckless philosophy.

                                1. 15

                                  Unfortunately, the suckless philosophy leads to software that doesn’t do what software is meant to do - alleviate human work by making the machine do it instead. The suckless philosophy of “simplicity” translates to “simplistic software that doesn’t do what you would need it to do, so either you waste time re-implementing features that other people have already written (or would have written), or you do the task by hand instead”.

                                  If, when required to choose exactly one of “change the software to reduce user effort” and “make the code prettier”, you consistently choose the latter, you’re probably not a software engineer - you’re an artist, making beautiful code art, for viewing purposes only. You definitely aren’t writing programs for other people to use, whether you think you are or not.

                                  This philosophy (and its results) is why I switched away from dmenu to rofi - because rofi provided effort-saving features that were valuable to me out-of-the-box, while dmenu did not. (I used dmenu for probably a year - but it only took me a few minutes with rofi to realize its value and switch.) rofi is more valuable as a tool, as an effort-saving device, than dmenu is, or ever will be.

                                  In other words - the suckless philosophy, when followed, actively makes computers less useful as tools (because, among other things, computers are communication and collaboration devices, and the suckless philosophy excludes large collaborative projects that meet the needs of many users). This is fine if your purpose is to make art, or to only fulfill your own needs - just make sure that you clearly state to other potential users that your software is not meant for them to make use of.

                                  Also note that the most-used (and useful) tools follow exactly the opposite of the suckless philosophy - Firefox, Chromium, Windows, Libreoffice, Emacs, VSCode, Syncthing, qmk, fish, gcc, llvm, rustc/cargo, Blender, Krita, GIMP, Audacity, OBS, VLC, KiCad, JetBrains stuff, and more - not just the ones that are easy to use, but also ones that are hard to use (Emacs, VSCode, Audacity, Blender, JetBrains) but meet everyone’s needs, and are extensible by themselves (as opposed to requiring integration with other simple CLI programs).

                                  There’s a reason for this - these programs are more useful to more people than anything suckless-like (or built using the Unix philosophy, which shares many of the same weaknesses). So, if you’re writing software exclusively for yourself, suckless is great - but if you’re writing software for other people (either as an open-source project, or as a commercial tool), it sucks.

                                  To top it off, people writing open-source software using the suckless philosophy aren’t contributing to non-suckless projects that are useful to other people - so the rest of the open-source community loses out, too.

                                  1. 2

                                    It’s fine to disagree with “the suckless philosophy”, but do please try to be nice to people gifting their work as FLOSS.

                                    1. 11

                                      The tone is not dissimilar from the language on the page it is replying to: https://suckless.org/philosophy/

                                      Many (open source) hackers are proud if they achieve large amounts of code, because they believe the more lines of code they’ve written, the more progress they have made. The more progress they have made, the more skilled they are. This is simply a delusion.

                                      Most hackers actually don’t care much about code quality. Thus, if they get something working which seems to solve a problem, they stick with it. If this kind of software development is applied to the same source code throughout its entire life-cycle, we’re left with large amounts of code, a totally screwed code structure, and a flawed system design. This is because of a lack of conceptual clarity and integrity in the development process.

                                      1. 1

                                        I wasn’t actually trying to copy the style of that page, but I guess that I kind of did anyway.

                                        @JoachimSchipper (does this work on Lobsters?) My reply wasn’t meant to be unkind to the developer. I’m frustrated, in the same sense that you might be frustrated with a fellow computer scientist who thought that requiring all functions to have a number of lines of code that was divisible by three would improve performance, an idea which is both obviously wrong, and very harmful if spread.

                                        …but I wasn’t trying to impart any ill will, malice, or personal attack toward FRIGN, just give my arguments for why the suckless philosophy is counter-productive.

                                    2. 1

                                      Some people insist on understanding how their tool works.

                                      I have a few friends like that - they are profoundly bothered by functionality they don’t understand and go to lengths to avoid using anything they deem too complex to learn.

                                      The rest of open source isn’t ‘losing out’ when they work elsewhere - they weren’t going to contribute to a ‘complex’ project anyways, because that’s not what they enjoy doing.

                                      1. -4

                                        Nice rant

                                        1. 5

                                          You throw advertising for suckless into the room as the true way, link to your website with the tone described here, and then complain about the responses? That doesn’t reflect well on the project.

                                          1. 0

                                            What are you talking about? There is no one single true way, as it heavily depends on what your goals are (as always). If your goals align with the suckless philosophy, it is useful to apply it, however, if your goals differ, feel free to do whatever you want.

                                            I don’t feel like replying exhaustively to posts that are ad hominem, which is why I only (justifiably) marked it as a rant. We can discuss suckless.org’s projects in another context, but here it’s only about the suckless philosophy of simplicity and minimalism. I am actually surprised that people even argue for more software complexity, and SLOC as a measure of achievement is problematic. In no way does the suckless philosophy exclude you from submitting to other projects, and it has nothing to do with pretty code. You can do pretty complex things with well-structured and readable code.

                                            1. 4

                                              While you may find it insulting, I don’t think the comment is ad hominem at all. It’s not saying that the suckless philosophy is defective because of who created/adopts/argues for it. That would be an attack ad hominem. The comment is saying the suckless philosophy is defective because software that adheres to it is less useful than other software, some of which was then listed in the comment.

                                              While it’s (obviously) perfectly fine to feel insulted by a comment that says your philosophy produces less-than-useful software, that does not make the comment either a rant or an ad hominem attack.

                                              It’s a valid criticism to say that over-simplifying software creates more human work, and a valid observation that software which creates more human work doesn’t do what software is meant to do. And you dismissed that valid criticism with “Nice rant.” If you don’t feel like responding exhaustively, no one can blame you. But I view “nice rant” as a less-than-constructive contribution to the conversation, and it appears that other participants do as well.

                                              1. 2

                                                My post was neither a rant, nor ad-hominem. You should be a bit more thoughtful before applying those labels to posts.

                                                It’s clearly not an ad-hominem because nowhere in my post did I attack your character or background, or use personal insults. My post exclusively addressed issues with the suckless philosophy, which is not a person, and is definitely not you.

                                                It’s also (more controversially) probably not a rant, because it relies on some actual empirical evidence (the list of extremely popular programs that do the opposite of the suckless philosophy, and the lack of popularity among programs that adhere to it), is well-structured, is broken up into several points that you can refute or accept, had parts of it re-written several times to be more logical and avoid the appearance of attacks on your person, and takes care to define terminology and context and not just say things like “suckless is bad” - for instance, the sentence “So, if you’re writing software exclusively for yourself, suckless is great - but if you’re writing software for other people (either as an open-source project, or as a commercial tool), it sucks” specifically describes the purpose for which the philosophy is suboptimal.

                                                It also had at least three orders of magnitude more effort put into it than your two-word dismissal.

                                            2. 4

                                              Rather than labeling it a rant, wouldn’t it be more persuasive to address some of the points raised? Can you offer a counterpoint to the points raised in the first or last paragraphs?

                                              1. 2

                                                It is definitely a rant, as the points addressed are usually based on the question of which norms you apply. Is your goal to serve as many people as possible, or is your goal to provide tools that are easy to combine for those willing to learn how to?

                                                In an ideal world, computer users would, considering the time they spend with them, strive to understand the computer as a tool to be learnt. However, it is becoming more and more prevalent that people don’t want to, and instead expect to be supported along the way for everything non-trivial. We all know many cases where OSS developers have quit over this, because the time demand of responding to half-assed PRs and feature requests is too high.

                                                @fouric’s post went a bit ad-hominem regarding “software engineers” vs. “artists”, but I smiled about it as my post probably hit a nerve somewhere. I would never call myself a “software engineer”.

                                                Hundreds of thousands of people are using suckless software, knowingly or unknowingly. Calling us “artists” makes it sound as if we were an esolang community. We aren’t. We are hackers striving for simplicity.

                                                I could also go into the security tangent, but that should already be obvious.

                                                1. 1

                                                  It is definitely a rant, as the points addressed are usually based on the question of which norms you apply.

                                                  …norms which I addressed - I was very clear to describe what purposes the suckless philosophy is bad at fulfilling. That is - my comment explicitly describes the “norms” it is concerned with, so by your own logic, it is not a rant - and it certainly contains more discussion than any of your further replies, which do nothing to refute any of its points.

                                                  @fouric’s post went a bit ad-hominem regarding “software engineers” vs. “artists”

                                                  Nor is there any ad-hominem contained in the comment, and certainly nothing around my differentiation between “software engineers” and “artists” - nowhere did I make any personal attack on your “character or motivations”, or “appeal to the emotions rather than to logic or reason”, as a quick online search reveals for the phrase “ad-hominem”.

                                                  I smiled about it as my post probably hit a nerve somewhere.

                                                  If you want to refute any of the points I made, you’re welcome to do it. However, if you just want to try to arouse anger and smile when you do so, I’ll ask you to stop responding to my posts, especially with a two-word dismissal that doesn’t address any of my arguments.

                                                  I would never call myself a “software engineer”

                                                  …then it should be even more clear that my post was not a personal attack or ad-hominem against you. “You keep using that word…”

                                          2. 4

                                            How does the suckless philosophy interact with feature flags?

                                            1. 2

                                              Suckless would say you should only have the features you use, and instead of feature flags you have optionally-applied patches.
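
                                              For a concrete illustration (a simplified sketch in the style of dwm’s config.h, not copied from any actual project): configuration happens at compile time, and an unwanted feature is removed by editing and recompiling, not by flipping a runtime flag.

                                              /* config.h - suckless-style compile-time configuration (illustrative) */
                                              static const unsigned int borderpx = 2;  /* window border width */
                                              static const char        *font     = "monospace:size=10";
                                              static const int          showbar  = 1;  /* don't want the bar? set 0, recompile */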

                                        1. 9

                                          This is a very interesting exploration of designing a non-cryptographic PRNG for modern x64.

                                          That said, non-cryptographic random is a specialized algorithm for some particularly CPU-heavy code; modern crypto is fast enough that one should just use crypto-quality random by default. ChaCha, commonly used as the basis for cryptographic PRNGs, runs at approximately 4 cycles per byte (https://en.wikipedia.org/wiki/Salsa20), and anything e.g. generating session cookies for the web will swamp the ChaCha overhead (and probably needs its superior security guarantees).
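
                                          As a sketch of “crypto-quality by default” (assuming a libc that provides getentropy(2), as modern Linux and the BSDs do):

                                          #include <stdint.h>
                                          #include <stdlib.h>
                                          #include <unistd.h>

                                          /* Default: kernel-provided cryptographic randomness. Only swap in a
                                             non-cryptographic PRNG where profiling shows this is a bottleneck. */
                                          static uint64_t random_u64(void)
                                          {
                                              uint64_t x;
                                              if (getentropy(&x, sizeof x) != 0)
                                                  abort(); /* no safe fallback without entropy */
                                              return x;
                                          }

                                          (A userspace ChaCha-based generator, e.g. arc4random(3) where available, amortizes the system call when many values are needed.)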

                                          1. 3

                                            I’d never want to run in 127.1/16-127.255/16; as the authors acknowledge,

                                            Since deployed implementations’ willingness to accept 127/8 addresses as valid unicast addresses varies, a host to which an address from this range has been assigned may also have a varying ability to communicate with other hosts.

                                            1. 2

                                              This is pretty amazing! The CPU core, admittedly without a register file, takes less logic than many of my simple cores that perform much more mundane tasks.

                                              The presented applications, i.e. deeply embedded CPUs, are also intriguing. A difference from most other deeply embedded processors is that RISC-V is a pretty generous architecture compared to e.g. an 8051 or a TMS1000, with many and wide registers. I am not sure, though, that the benefits hold up once the processor is paired with memory, I/O and program storage.

                                              Nevertheless, a really cool project.

                                              1. 2

                                                Yes; to put a number to it, the first linked introductory video claims to implement the core on a Xilinx Artix 7 in 130 LUTs plus 206 flip-flops, plus memory (but the register file can be stored in the same block RAM holding code and data memory).

                                                1. 2

                                                  Thanks for pointing out that the code/data memory and the register file can be colocated. This is certainly attractive to keep the whole system resource usage down.

                                                  I just realized that the external register file, especially if done with block RAM, lends itself to implementing hardware-supported threads or something similar to hyperthreading. By changing a base pointer into the RAM, the CPU can have multiple register files, with one of them being active at any given time.
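
                                                  A software model of the idea (just a sketch to show the indexing, not HDL):

                                                  #include <stdint.h>

                                                  enum { NREGS = 32, NTHREADS = 4 };

                                                  /* One block RAM holds all threads' register files back-to-back. */
                                                  static uint32_t regram[NTHREADS * NREGS];
                                                  static uint32_t base; /* base pointer selecting the active thread */

                                                  static inline uint32_t reg_read(unsigned r) { return regram[base + r]; }
                                                  static inline void reg_write(unsigned r, uint32_t v) { regram[base + r] = v; }

                                                  /* A "context switch" is just a base-pointer update. */
                                                  static inline void switch_thread(unsigned t) { base = t * NREGS; }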

                                                  1. 2

                                                    Many of the applications for a CPU like this don’t need any state outside of the CPU registers – especially as RISC-V lets you do multiple levels of subroutine call without touching RAM if you manually allocate different registers and a different return address register for each function (which means programming in asm not C). A lot of 8051 / PIC / AVR have been sold without any RAM (or with RAM == memory mapped registers)

                                              1. 5

                                                This is an interesting article, but some things that stood out to me:

                                                 • I’d have liked to see the article engage more with the libc case. The article points out several arguments against statically linking libc specifically: it’s the only supported interface to the OS on most *nix-ish systems that are not Linux; it’s a very widely used library (so disk/memory savings can be substantial); and, as part of the system, it should be quite stable. Also, Go eventually went back to (dynamically) linking against libc rather than directly making syscalls [EDIT: on many non-Linux systems].
                                                • LLVM IR is platform-specific and not portable. And existing C code using e.g. #ifdef __BIG_ENDIAN__ (or using any system header using such a construction!) is not trivial to compile into platform-independent code. Yes, you can imagine compiling with a whole host of cross-compilers (and system headers!), but at some point just shipping the source code begins looking rather attractive…
                                                • exploiting memory bugs is a deep topic, but the article is a bit too simplistic in its treatment of stack smashing. There’s a lot to dislike about ASLR, but ASLR is at least somewhat effective against - to think up a quick example - a dangling pointer in a heap-allocated object being use(-after-free)d to overwrite a function / vtable pointer on the stack.
                                                • in general, there’s a lot of prior art that could be discussed; e.g. I believe that Windows randomizes (randomized?) a library’s address system-wide, rather than per-process.
                                                1. 3

                                                  And existing C code using e.g. #ifdef __BIG_ENDIAN__ (or using any system header using such a construction!) is not trivial to compile into platform-independent code.

                                                  The article addresses this.

                                                  I would also note as an aside that any code utilising #ifdef __BIG_ENDIAN__ is just plain wrong. Yes, even /usr/include/linux/tcp.h. Just don’t do that. Write the code properly.

                                                  1. 2

                                                     I’ll bite. I have written an MC6809 emulator that makes the following assumptions about the host system:

                                                    • A machine with 8-bit chars (in that it does support uint8_t)

                                                    • A 2’s complement architecture

                                                    I also do the whole #ifdef __BIG_ENDIAN__ (effectively), check out the header file. How would you modify the code? It’s written that way to a) make it easier on me to understand the code and b) make it a bit more performant.

                                                    1. 3

                                                      I would write inline functions which would look like:

                                                      static inline mc6809byte__t msb(mc6809word__t word) { return (word >> 8) & 0xff; }
                                                      static inline mc6809byte__t lsb(mc6809word__t word) { return word & 0xff; }
                                                      

                                                      And I would use those in place of the current macro trick of replacing something->A with something->d.b[MSB] etc.

                                                       I don’t think this would significantly impact readability. Clang seems to produce identical code for both cases, although from a benchmark there’s still some minor performance impact (10% if your entire workload is just getting data out of the MSB and LSB), though this may be an issue with my benchmark. gcc seems to struggle to realize that they are equivalent and keeps the shr, but the performance impact is only 13%.

                                                      If you need to support writing to the MSB and LSB you would need a couple more inline functions:

                                                      static inline void set_msb(mc6809word__t *word, mc6809byte__t msb) { *word = lsb(*word) | ((msb & 0xff) << 8); }
                                                      static inline void set_lsb(mc6809word__t *word, mc6809byte__t lsb) { *word = (msb(*word) << 8) | (lsb & 0xff); }
                                                      

                                                      I haven’t benchmarked these.

                                                      I think the point I would make is that you should benchmark your code against these and see whether there is a real noticeable performance impact moving from your version to this version. Through this simple change you can make steps to drop your assumption of CHAR_BIT == 8 and your code no longer relies on type punning which may or may not produce the results you expect depending on what machine you end up on. Even though your current code is not doing any in-place byte swapping, you still risk trap representations.

                                                      P.S. *_t is reserved for type names by POSIX.

                                                      1. 2

                                                        I would definitely have to benchmark the code, but I can’t see it being better than what I have unless there exists a really magical C compiler that can see through the shifting/masking and replace them with just byte read/writes (which is what I have now). Your set_lsb() function is effectively:

                                                        *word = ((*word >> 8) & 0xff) << 8 | lsb & 0xff;
                                                        

                                                        A 10-13% reduction in performance seems a bit steep to me.

                                                        you still risk trap representations.

                                                        I have to ask—do you know of any computer sold new today that isn’t byte oriented and 2’s complement? Or hell, any machine sold today that actually has a trap representation? Because I’ve been programming for over 35 years now, and I have yet to come across one machine that a) isn’t byte oriented; b) 2’s complement; c) has trap representations. Not once. So I would love to know of any actual, physical machines sold new that breaks one of these assumptions. I know they exist, but I don’t know of any that have been produced since the late 60s.

                                                        1. 2

                                                          a really magical C compiler that can see through the shifting/masking and replace them with just byte read/writes

                                                          Yes, that’s what clang did when I tested it on godbolt. In fact I can get it to do it in all situations by swapping the order of the masking and the shifting.

                                                          Here’s the result:

                                                                                  output[i].lower = in & 0xff;
                                                            4011c1:       88 54 4d 00             mov    %dl,0x0(%rbp,%rcx,2)
                                                                                  output[i].upper = (in & 0xff00) >> 8;
                                                            4011c5:       88 74 4d 01             mov    %dh,0x1(%rbp,%rcx,2)
                                                          

                                                          You underestimate the power of compilers - although I’m not sure why gcc can’t do it; it’s really a trivial optimisation, all things considered.

                                                          I just checked further, and it seems the only reason the clang-compiled mask&shift variant performs differently is different amounts of loop unrolling, plus the mask&shift code using the high and low registers instead of multiple movbs. The godbolt code didn’t use movbs; it was identical for both cases in clang.

                                                          My point being that in reality you may get 10% (absolute worst case) difference in performance just because the compiler felt like it that day.

                                                          I have to ask—do you know of any computer sold new today that isn’t byte oriented and 2’s complement? Or hell, any machine sold today that actually has a trap representation?

                                                          I don’t personally keep track of the existence of such machines.

                                                          For me it’s not about “any machine” questions; it’s about sticking to the C abstract machine until there is a genuine need to stray outside of it. A maybe-13% (at absolute, unrealistically optimistic best) performance improvement is not worth straying outside the definitions of the C abstract machine.

                                                          In general I have found this to produce code with fewer subtle bugs. To write code which conforms to the C abstract machine you just have to know exactly what is well defined. To write code which goes past the C abstract machine you have to know with absolute certainty about all the things which are not well defined.

                                                          edit: It gets worse. I just did some benchmarking and I can get swings of +-30% performance by disabling loop unrolling. I can get both benchmarks to perform the same by enabling and disabling optimization options.

                                                          This is a tight loop doing millions of the same operation. Your codebase has a lot more variation than that. It seems more likely you’ll get a 10% performance hit/improvement by screwing around with optimisation than you will by making the code simply more correct.

                                                  2. 2

                                                    Wait, Go dynamically links to libc now? Do you have more details? I thought Go binaries have zero dependencies and only the use of something like CGO would change that.

                                                    1. 15

                                                      As the person responsible for making Go able to link to system libraries in the first place (on illumos/Solaris; others later used this technology for OpenBSD, AIX and other systems), I am baffled why people have trouble understanding this.

                                                      Go binaries, just like any other userspace binary, depend at least on the operating system. “Zero dependency” means binaries don’t require dependencies other than the system itself. It doesn’t mean that the dependency on the system cannot use dynamic linking.

                                                      On systems where “the system” is defined by libc or its equivalent shared library, like Solaris, Windows, OpenBSD, possibly others, the fact that Go binaries are dynamically linked with the system libraries doesn’t make them not “zero dependency”. The system libraries are provided by the system!

                                                      Also note that on systems that use a shared library interface, Go doesn’t require the presence of the target shared library at build time, only compiled binaries require it at run time. Cross-compiling works without having to have access to target system libraries. In other words, all this is an implementation detail with no visible effect to Go users, but somehow many Go users think this is some kind of a problem. It’s not.

                                                      1. 3

                                                        I (clarified) was thinking about e.g. OpenBSD, not Linux; see e.g. cks’ article.

                                                        1. 2

                                                          Under some circumstances Go will still link to libc. The net and os packages both use libc for some calls, but have fallbacks that are less functional when CGO is disabled.

                                                          1. 6

                                                            The way this is usually explained is a little bit backwards. On Linux, things like non-DNS name resolution (LDAP, etc.) are under the purview of glibc (not any other libc!) with its NSCD protocol and glibc-specific NSS shared libraries.

                                                            Of course, if you want to use glibc-specific NSS, you have to link to glibc, and of course, if you elect not to link with glibc, you don’t get NSS support.

                                                            Most explanations of Go’s behavior are of the kind “Go is doing something weird”, while the real weirdness is that on Linux, name resolution is not under the purview of the system, but of a 3rd party component - and people accept this sad state of affairs.

                                                            1. 3

                                                              How is glibc a 3rd party component on, say, Debian? Or is every core component 3rd party since Debian does not develop any of them?

                                                              1. 4

                                                                Glibc is a 3rd party component because it is not developed by the first party, which is the Linux developers.

                                                                Glibc sure likes to pretend it’s first party, though. It’s not - a fact attested to simply by the existence of other Linux libc libraries, like musl.

                                                                Contrast that with the BSDs, or Solaris, or Windows, where libc (or its equivalent) is a first party component developed by BSDs, Solaris, or Windows developers.

                                                                I would hope that Debian Linux would be a particular instance of a Linux system, rather than an abstract system itself, and I could use “Linux software” on it but glibc’s portability quirks and ambitions of pretending to be a 1st party system prevent one from doing exactly that.

                                                                Even if you think that Linux+glibc should be an abstract system in itself, distinct from, say, pure Linux, or Linux+musl, irrespective of the pain that would inflict on me, the language developer, glibc is unfeasible as an abstract interface because it is not abstract.

                                                                1. 3

                                                                  Wait, how are the Linux developers “first party” to anything but the kernel?

                                                                  I would hope that Debian Linux would be a particular instance of a Linux system

                                                                   There’s no such thing as “a Linux system”, only “a system that uses Linux as a component”. Debian is a system comparable to FreeBSD, so is RHEL. Now some OS are specifically derived from others, so you might call Ubuntu “a Debian system” and then complain that snap is an incompatible bolt on or something (just an example, not trying to start an argument about snap).

                                                                  1. 5

                                                                     Of course there is such a thing as a Linux system: you can download it from kernel.org; it comes with a stable API and ABI, and usually - in fact, with the exception of NSS above, absolutely always - does exactly what you want from it.

                                                                     Distributions might provide some kind of value for users, and because they provide value they overestimate their technical importance with silly statements like “there is no Linux system, just distributions”; no doubt this kind of statement comes from GNU itself, with its GNU+Linux stance, but to a language designer none of this matters at all. All that matters is APIs and ABIs, and who provides them. On every normal system, the developer of the system dictates and provides its API and ABI, and in the case of Linux that’s no different: Linux comes with its stable API and ABI, and as a user of Linux I can use it, thank you very much. The fact that on Linux this ABI comes through system calls, while on, say, Solaris it comes from a shared library, is an implementation detail. Glibc, a 3rd party component, comes with an alternative API and ABI, and for whatever reason some people think that is more canonical than the first party API and ABI provided by the kernel itself. The audacity of glibc developers claiming authority over such a thing is unbelievable.

                                                                    As a language developer, and in more general as an engineer, I work with defined systems. A system is whatever has an API and an ABI, not some fuzzy notion defined by some social organization like a distribution, or a hostile organization like GNU.

                                                                     As an API, glibc is valuable (but so is musl); as an ABI, glibc has negative value both for the language developer and for its users. The fact that in Go we can ignore glibc means not only freedom from distributions’ and glibc’s ABI quirks, but also that I can have systems with absolutely no libc at all - just a Linux kernel and Go binaries, a fact that plenty of embedded people make use of.

                                                                    1. 2

                                                                      Of course there is such a thing as a Linux system, you can download it from kernel.org,

                                                                      So is libgpiod part of the system too? You can download that from kernel.org as well. You can even download glibc there.

                                                                      it comes with a stable API and ABI, and usually, in fact with the exception of NSS above, absolutely always does exactly what you want from it.

                                                                      Unless you want to do something other than boot :)

                                                        2. 2

                                                          I’d have liked to see the article engage more with the libc case.

                                                          Fair, though I feel like I engaged with it a lot. I even came up with ideas that would make it so OS authors can keep their dynamically-linked libc while not causing problems with ABI and API breaks.

                                                          What would you have liked me to add? I’m not really sure.

                                                          LLVM IR is platform-specific and not portable.

                                                          Agreed. I am actually working on an LLVM IR-like alternative that is portable.

                                                          And existing C code using e.g. #ifdef __BIG_ENDIAN__ (or using any system header using such a construction!) is not trivial to compile into platform-independent code.

                                                          This is a good point, but I think it can be done. I am going to do my best to get it done in my LLVM alternative.

                                                          the article is a bit too simplistic in its treatment of stack smashing.

                                                          Fair; it wasn’t the big point of the post.

                                                          There’s a lot to dislike about ASLR, but ASLR is at least somewhat effective against - to think up a quick example - a dangling pointer in a heap-allocated object being use(-after-free)d to overwrite a function / vtable pointer on the stack.

                                                          I am not sure what your example is. Could you walk me through it?

                                                          1. 3

                                                            Agreed. I am actually working on an LLVM IR-like alternative that is portable.

                                                            This is not possible for C/C++ without either:

                                                            • Significantly rearchitecting the front end, or
                                                            • Effectively defining a new ABI that is distinct from the target platform’s ABI.

                                                            The second of these is possible with LLVM today. This is what pNaCl did, for example. If you want to target the platform ABI then the first is required, because C code is not portable after the preprocessor has run. The article mentions __BIG_ENDIAN__, but that’s actually a pretty unusual corner case. It’s far more common to see things that conditionally compile based on pointer size - UEFI bytecode tried to abstract over this, attempts to make Clang and GCC target it have been made several times and failed, and that was with a set of headers written to be portable. C has a notion of an integer constant expression that, for various reasons, must be evaluated in the front end. You can make this symbolic in your IR, but existing front ends don’t.
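
                                                            A tiny, hypothetical example of the pointer-size case: the #if and the constant expression below must be folded by the front end, so any IR produced from this is already target-specific.

                                                            #include <stdint.h>

                                                            #if UINTPTR_MAX > 0xffffffffu
                                                            typedef int64_t fast_index; /* 64-bit targets */
                                                            #else
                                                            typedef int32_t fast_index; /* 32-bit targets */
                                                            #endif

                                                            /* An integer constant expression, evaluated in the front end:
                                                               sizeof(fast_index) is baked into the emitted IR. */
                                                            enum { INDEX_BYTES = sizeof(fast_index) };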

                                                            The same is true of C++ templates, where it’s easy to instantiate a template with T = long as the template parameter and then define other types based on sizeof(T) and things like if constexpr (sizeof(T) < sizeof(int)), at which point you need to preserve the entire AST. SFINAE introduces even more corner cases where you actually need to preserve the entire AST and redo template instantiation for each target (which may fail on some platforms).

                                                            For languages that are designed to hide ABI details, it’s easy (see: Java or CLR bytecode).

                                                            1. 2

                                                              I believe you are correct, but I’ll point out that I am attempting to accomplish both of the points you said would need to happen, mostly with a C preprocessor with some tricks up its sleeve.

                                                              Regarding C++ templates, I’m not even going to try.

                                                              I would love to disregard C and C++ entirely while building my programming language, but I am not disregarding C, at least, because my compiler will generate C. This will make my language usable in the embedded space (there will be a -nostd compiler flag equivalent), and it will allow the compiler to generate its own C source code, making bootstrap easy and fast. Unlike Rust or Haskell, where you have to follow the bootstrap chain from the beginning, my compiler will ship with its own C source, making bootstrap as simple as:

                                                              1. Compile C source.
                                                              2. Compile Yao source. (Yao is the name of the language.)
                                                              3. Recompile Yao source.
                                                              4. Ensure the output of 2 and 3 match.

                                                              With it that easy, I hope that packagers will find it easy enough to do in their packages.

                                                              1. 2

                                                                If your input is a language that doesn’t expose any ABI-specific details and your output is C (or C++) code that includes platform-specific things, then this is eminently tractable.

                                                                This is basically what Squeak does. The core VM is written in a subset of Smalltalk that can be statically compiled to C. The C code can then be compiled with your platform’s favourite C compiler. The rest of the code is all bytecode that is executed by the interpreter (or JIT with Pharo).

                                                            2. 3

                                                              Agreed. I am actually working on an LLVM IR-like alternative that is portable.

                                                                If you’re looking for prior art, the TenDRA Distribution Format (TDF) was an earlier attempt at a UNCOL for C.

                                                              1. 1

                                                                Thank you for the reference!

                                                              2. 3

                                                                With respect to libc - indeed, the article engaged quite a bit with libc! That’s why I was waiting for a clear conclusion.

                                                                E.g. cosmopolitan clearly picks “all the world’s an x86_64, but may be running different OSes”; musl clearly picks “all the world’s Linux, but may be running on different architectures”. You flirt with both “all the world’s Linux” and something like Go-on-OpenBSD’s static-except-libc. Both are fine enough.

                                                                With respect to ASLR: I agree that this isn’t the main point of your article, and I don’t think I explained what I meant very well. Here’s some example code, where the data from main() is meant to represent hostile input; I mean to point out that merely segregating arrays and other data / data and code doesn’t fix e.g. lifetime issues, and that ASLR at least makes the resulting bug a bit harder to exploit (because the adversary has to guess the address of system()). A cleaned-up example can be found below, (actually-)Works-For-Me code here.

static void buggy(void *user_input) {
    uintptr_t on_stack_for_now;
    /* Bug here: on_stack_for_now doesn't live long enough! The
     * scheduler keeps a pointer into this stack frame after it
     * returns. */
    scheduler_enqueue(write_what_where, user_input, &on_stack_for_now);
}

static void victim(const char *user_input) {
    /* This slot happens to reuse the stack memory buggy() leaked... */
    void (*function_pointer)() = print_args;

    /* ...so the deferred write_what_where() clobbers it with the
     * attacker-controlled value: the address of system(). */
    if (scheduler_run() != 0)
        abort();

    function_pointer(user_input);
}

int main(void) {
    buggy((void *)system);   /* "hostile input": a code address */
    victim("/bin/sh");       /* ends up as system("/bin/sh") */
}
                                                                
                                                                1. 2

                                                                  With respect to libc - indeed, the article engaged quite a bit with libc! That’s why I was waiting for a clear conclusion.

                                                                  I see. Do you mean a conclusion as to whether to statically link libc or dynamically link it?

                                                                  I don’t think there is a conclusion, just a reality. Platforms require programmers to dynamically link libc, and there’s not much we can do to get around that, though I do like the fact that glibc and musl together give us the choice on Linux.

However, I think the conclusion you are looking for might be that it does not matter, because both options suck where libc is concerned!

                                                                  If you statically link on platforms without a stable syscall ABI, good luck! You can probably only make it work on machines with the same OS version.

                                                                  If you dynamically link, you’re probably going to face ABI breaks eventually.

                                                                  So to me, the conclusion is that the new ideas I gave are necessary to make working with libc easier on programmers. Right now, it sucks; with my ideas, it wouldn’t (I hope).

                                                                  Does that help? Sorry that I didn’t make it more clear in the post.

                                                                  Regarding your code, I think I get it now. You have a point. I tried to be nuanced, but I should not have been.

I am actually developing a language that takes lifetimes into account while also keeping stacks separated. I hope that doing both will either eliminate the possibility of such attacks or make them infeasible.

                                                            1. 2

                                                              This sounds like such an obvious thing but we keep seeing this happen over and over again: users write code that uses libcurl functions but they don’t check the return codes.

                                                              I have to wonder how many times Rust’s “must use” warnings have saved my butt. This is an easy thing to miss once in a while.

                                                              For those who aren’t familiar with Rust, it will yell at you if you don’t do anything with the “Result” return type, which may indicate an error. It’s a warning, not an error, but still very helpful.

                                                              1. 1

gcc (and clang, and…) has __attribute__((warn_unused_result)) for the same purpose.

                                                                1. 2

Which is still helpful even on functions where you might want to use the result only, say, 95% of the time, since you can write a call like (void)someFunction(); to explicitly notify the compiler that you’re ignoring the result on purpose.
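A minimal sketch of both halves of this sub-thread (must_check is a made-up name). One caveat I’m fairly sure of: clang honours the (void) cast, but gcc has historically kept warning for __attribute__((warn_unused_result)) even with the cast (it does honour the cast for C++17’s [[nodiscard]]):

    #include <stdio.h>

    __attribute__((warn_unused_result))
    static int must_check(void) {
        return -1; /* stand-in for a real error code */
    }

    int main(void) {
        must_check();       /* warning: ignoring return value */
        (void)must_check(); /* explicit discard; see caveat above */
        if (must_check() != 0)
            puts("error handled");
        return 0;
    }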

                                                              1. 1

                                                                For the specific example given, you may want to consider using Ascii85 instead of Base64? SNS apparently supports at least 0x20..0x7f, and you could even swap 0x7f = DEL for e.g. 0x0a = \n if you want printable characters, so…
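In case it helps, here’s a minimal sketch of the Ascii85 core (plain C; final-group padding and the ‘z’ shortcut for all-zero groups are omitted). Every 4 input bytes become 5 output characters in ‘!’ (0x21) through ‘u’ (0x75), i.e. 25% expansion versus Base64’s 33%:

    #include <stdint.h>
    #include <stdio.h>

    /* Encode one 4-byte group as 5 Ascii85 digits in '!'..'u'. */
    static void ascii85_block(const uint8_t in[4], char out[5]) {
        uint32_t v = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
                   | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
        for (int i = 4; i >= 0; i--) {
            out[i] = '!' + v % 85; /* least significant digit last */
            v /= 85;
        }
    }

    int main(void) {
        char enc[6] = {0};
        ascii85_block((const uint8_t *)"Man ", enc);
        printf("%s\n", enc); /* prints "9jqo^" */
    }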

                                                                1. 3

The directory browser that ships with vim is not particularly intuitive and ships with a wealth of features I will most likely never use. I get the sense that many developers just blindly install a shiny plugin without understanding what netrw can do. Sure, netrw is not perfect, but fewer dependencies in my life and striving for simplicity is a good thing.

It’s okay to use plugins if it’ll make your life better, even if you can manage to do the same things in an unintuitive way without them.

                                                                  1. 4

Clearly, yes, but there’s also value in keeping your configuration small and easy to maintain, and in being able to use computers you haven’t extensively customized.

                                                                    I know that you lean heavily into “programming” your environment with AutoHotKey etc., and I know there are advantages to doing it that way. Me, I started out doing sysadmin stuff, so being able to use a “stock” computer was important to me - and I’m trying to keep maintenance work down so that I can do more “fun” stuff.

                                                                    1. 3

                                                                      Oh, totally agreed! There is significant value in keeping your setup minimalist, and I know I’ve given up a lot of good stuff by customizing so heavily. Whenever I have to SSH into an AWS instance it’s miserable. And I can only afford to customize so much because I’m independent and use only one computer.

I mostly push back because a lot of online vim communities turn it into almost a moral thing: ultraminimalism is The Right Way and heavily customizing is Wrong. You see this sentiment a lot on r/vim, esp. with -romainl-, and to my understanding it’s also infected the vim IRC. The world is big enough for both of us.

                                                                      1. 1

                                                                        ultraminimalism is The Right Way and heavily customizing is Wrong

                                                                        Not specific to vim, but there is a big advantage in discouraging customisation among your core userbase: it encourages you to get the defaults right. This is a problem that a lot of open source programs suffer from: you end up with a load of customisation potential and all of the long-term users have a setup that is very different from new users. New users struggle to get the thing to work because they haven’t accumulated the massive set of configuration options that everyone who uses the program seriously has built up over the years.

I keep reading about how amazing neovim is, but I spent half an hour trying to follow some instructions from someone who had an (apparently) amazing config, and gave up. It doesn’t matter to me that it can do all of these amazing things if I can’t figure out how to make it do any of them.

                                                                        1. 1

Isn’t that more of a second-order knowledge-gathering problem (project level) and/or an experience-transfer problem (social level)? The developer has provided a set of features believed to be useful or at least interesting, yet does not follow up to learn how they are actually used or baked into a workflow, as a step towards providing presets that can be added to/subtracted from and taught. The users react to this by creating ad-hoc exchanges with low visibility, poor discovery, upstream desynchronisation and a lot of other problems.

                                                                  1. 7

                                                                    We already have various tools for enabling growth: the freedom to use the software for any purpose being one of the most powerful.

This has been tried for the last 30 years with very limited success, and the success it did have is probably not primarily attributable to “Software Freedom” either.

                                                                    Essentially this article is “keep doing what we’ve been doing for the last 30 years”, which doesn’t strike me as a very good strategy.

                                                                    1. 6

                                                                      with very limited success

                                                                      Hobbyist software hasn’t exactly displaced end-user software, but the major browsers are (corporate) open-source and pretty much all developer tooling and infrastructure is. I’d say open source has been very successful!

                                                                      1. 7

                                                                        Hobbyist software hasn’t exactly displaced end-user software, but the major browsers are (corporate) open-source and pretty much all developer tooling and infrastructure is. I’d say open source has been very successful!

                                                                        Ok but how much browser hacking have you done? How many times have you opened up Blender’s source code – or Chromium’s – and gone “ok I can make a change to this within a reasonable timeframe to make it do what I want”.

Free as in freedom doesn’t mean shit when codebases are so large as to be inscrutable and impossible to duplicate.

                                                                        1. 10

                                                                          Something as complex as a modern web browser is going to have complex source code. There’s no way around that, and it has nothing to do with being open source or not.

                                                                          1. 4

                                                                            Sometimes those codebases are large because they actually do something useful.

                                                                            I will say I’ve gone in on complex projects before (usually language runtimes) when needed/desired and been able to figure out changes. Usually, I don’t because I have no need.

                                                                            1. 2

                                                                              On one hand, there are two hobbyist forks of Chromium that I use: Ungoogled Chromium (for desktop) and Bromite (for Android). These are both Chromium with no google services. That these projects are possible is a success of open source.

On the other hand, in order to have an ecosystem that fully embodies the spirit and values of free software, we need to embrace the OP’s recommendation, and go well beyond that. Software needs to be much simpler. Languages need to be much simpler. Hypercomplex mega-apps need to be replaced by an ecology of small tools that interoperate on a shared data model. (We had that in the Unix CLI, but then we threw that away when we permitted Apple and Microsoft to define how a GUI works.) Operating systems need to allow you to trivially examine and modify any code that you happen to be running, much more like Smalltalk or a Lisp machine, and much less like the C + Unix model that the community adopted back in the 1980s.

                                                                            2. 3

                                                                              Hobbyist software hasn’t exactly displaced end-user software

Linux was originally a hobby OS, so at least “software that started as hobbyist” did displace end-user software pretty much everywhere on internet servers.

                                                                              1. 2

                                                                                It’s been very successful at developer tooling. Which makes sense, since the whole ‘you can modify and change it as you please’ ethic lends itself naturally towards developers who can make those changes. It’s failed pretty hard at end-user software, with browsers being one of the only exceptions I can think of.

                                                                                1. 2

I guess one thing is that end-users are often unaware that things might be changeable. Many still view computing as a black box that only the “gifted” can bend to their will. Everyone has a “handy cousin” who can fix their printer, but not everyone has a handy cousin who can change a program for them to do what they want.

And besides, if someone were to ask me to create what is in essence a private fork of a program, I would be very hesitant to do so: I would have to maintain it until the end of days, and somehow incorporate security updates into it. No, thanks!

                                                                                  And paying a company to do the same would get prohibitively expensive for most people, especially if it’s just a small seemingly trivial change like, I dunno, “could you change it so that the menu bar is at the bottom rather than at the top?”

                                                                                  Of course the more savvy users would end up making implementation tickets in the ticket trackers of the projects they care about, but I doubt the small free software projects would be able to deal with the large inflow of such “trivial” requests without patches, or even the certainty people would be able to verify the new version does what they want. Dev: “can you compile this most recent version and test it?” User: “what does ‘compile’ mean?”

                                                                                2. 1

                                                                                  I’d say open source has been very successful!

                                                                                  The point of this article is that it’s been successful in a way that hasn’t meaningfully accomplished much in terms of user empowerment. Open source gave us the browser ecosystem which has been a very powerful force for allowing tech companies to deliver new products (which is exactly why open source exists) but it’s succeeded only by redirecting goodwill away from the user-centric free software movement towards the corporate-friendly goals of open source. So we have these incredibly powerful browsers that are borderline impossible for anyone who doesn’t work at Google/Apple/Mozilla to contribute to. We have mobile phones which have impressive computing power and are largely technically open source, but they spy on us and don’t allow us to do much to protect against hostile corporate behavior even on our own devices.

                                                                                  By focusing on technologies which are intentionally anti-scale, we may be able to redirect some of that lost momentum from the last couple decades back to technologies which can put the user in control and respect consent and autonomy.

                                                                                  1. 1

                                                                                    But is this attributable to “software freedom” or something else?

                                                                                    How many people use Chrome because it’s Free Software? I’d say very very few.

                                                                                1. 3

                                                                                  People joke about how we’re now going to need 128-bit integers…well, anyone who works with IPv6 addresses or UUIDs loves 128-bit integers.

                                                                                  1. 3

                                                                                    Yeah but IPv6 addresses and UUIDs are really opaque blobs of bits. You’re not doing arithmetic on them. Bitmasking for IPv6, maybe.
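Bitmasking is where a 128-bit type does pull its weight, though. A sketch using the gcc/clang unsigned __int128 extension (not standard C; load128 is a made-up helper) to test whether an address falls inside a /48:

    #include <stdio.h>
    #include <arpa/inet.h>

    typedef unsigned __int128 u128; /* gcc/clang extension */

    /* Load a big-endian in6_addr into one 128-bit integer. */
    static u128 load128(const struct in6_addr *a) {
        u128 v = 0;
        for (int i = 0; i < 16; i++)
            v = (v << 8) | a->s6_addr[i];
        return v;
    }

    int main(void) {
        struct in6_addr addr, net;
        if (inet_pton(AF_INET6, "2001:db8:1234:5678::1", &addr) != 1 ||
            inet_pton(AF_INET6, "2001:db8:1234::", &net) != 1)
            return 1;
        u128 mask = ~(u128)0 << (128 - 48); /* /48 prefix mask */
        puts((load128(&addr) & mask) == load128(&net)
                 ? "same /48" : "different /48");
    }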

                                                                                    1. 2

You can do 64-bit multiplication without overflow UB if you have 128-bit integers, and then checking for overflow becomes easy and UB-free.
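A sketch of that trick, again leaning on the non-standard __int128 that gcc and clang provide on 64-bit targets (mul_i64 is a made-up name); note that a 128-bit product of two int64_t values can never overflow, so the wide multiply itself is UB-free:

    #include <stdbool.h>
    #include <stdint.h>

    /* Returns true on overflow; *out is written only when the
     * product fits, so no out-of-range conversion ever happens. */
    static bool mul_i64(int64_t a, int64_t b, int64_t *out) {
        __int128 wide = (__int128)a * b;
        if (wide > INT64_MAX || wide < INT64_MIN)
            return true;
        *out = (int64_t)wide;
        return false;
    }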

                                                                                      1. 4

                                                                                        This seems like a strange way to check for overflow. Frankly, C should just have built-in checked arithmetic intrinsics. Rust got this right.

                                                                                        1. 1

                                                                                          I agree.

                                                                                          1. 1

                                                                                            gcc does have built-in checked arithmetic. Standard C doesn’t have add-with-overflow, but gcc alone is more portable than rustc.
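Concretely, those are __builtin_add_overflow, __builtin_sub_overflow and __builtin_mul_overflow (also in clang): they store the wrapped result and return true when it didn’t fit:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        int64_t r;
        if (__builtin_mul_overflow(INT64_MAX / 2, 3, &r))
            puts("multiplication overflowed");
        if (!__builtin_add_overflow(INT64_MAX - 1, 1, &r))
            printf("sum: %lld\n", (long long)r);
        return 0;
    }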

                                                                                            1. 2

                                                                                              Yeah, gcc and clang have intrinsics.

                                                                                              Fair point about Rust.

                                                                                              Just checked, and Zig also has intrinsics for this, so at least newer languages are learning from the lack of these in the C standard.

                                                                                              1. 1

Swift traps on overflow by default; if you want values to wrap, you use the &+, &-, etc. operators.

                                                                                      2. 1

32-bit GCC doesn’t provide 128-bit integers, which indeed complicates these things.

                                                                                        1. 1

                                                                                          however, on a RV128I you just need a long ;)

                                                                                      1. 1

                                                                                        Wow, that’s pretty nasty. The attacker is essentially spoofing the DHCP server’s response, which the DHCP client accepts because the client’s XID is generated by an inadequately-seeded RNG. Since e.g. the public key of the root user is ultimately fetched from a server configured via DHCP (with no further authentication, etc.), pwnage follows.

                                                                                        Nice find!
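For what it’s worth, the client-side fix is cheap. A sketch (using Linux’s getrandom(2); fresh_xid is a made-up name) of drawing the XID from the kernel CSPRNG instead of a weakly seeded userspace PRNG; of course, an on-path attacker can still read the XID off the wire, so unpredictability only stops blind spoofing:

    #include <stdint.h>
    #include <stdlib.h>
    #include <sys/random.h> /* getrandom(2), Linux-specific */

    /* An unpredictable DHCP transaction ID: an attacker who cannot
     * guess the XID cannot blindly spoof the server's reply. */
    static uint32_t fresh_xid(void) {
        uint32_t xid;
        if (getrandom(&xid, sizeof xid, 0) != sizeof xid)
            abort(); /* don't fall back to a weak PRNG */
        return xid;
    }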

                                                                                        1. 2

                                                                                          Your plan to have decent backups is good; your decision to have only three is baffling.

                                                                                          1. 1

I figure: what would I ever need to do with a fourth backup? It just seems paranoid. I’ll almost certainly always restore from the most recent anyway. Plus, I can always increase it in the future (and am open to changing it now, I just can’t think of why I would).

                                                                                            1. 4

                                                                                              One real benefit to using tarsnap, specifically, is that tarsnap will deduplicate backed-up data across all backups. Deduplication doesn’t really help you if your server stores a rotating set of huge movies, but if you have a slowly-growing set of data, keeping old backups around is pretty much free.

                                                                                              1. 2

                                                                                                My backup scheme is:

                                                                                                • daily-01 to daily-31, these get overwritten as days progress.
                                                                                                • YYYY-MM, which never get overwritten.

                                                                                                The monthly snapshots are done on the first of every month.

                                                                                            1. 6

I think the fact that SSH’s wire protocol can only take a single string as a command is a huge, unrecognized design flaw. It would be so much better to take an array of command name plus arguments, just like the underlying syscalls. The distinction is similar to a Dockerfile giving the choice between a command as a string vs. an array, where the string gets passed to a shell and the array doesn’t.
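To make that distinction concrete, a sketch (nothing SSH-specific; the filename is invented): an argument vector, as execve(2) and friends take, keeps a filename with spaces intact, while a single string has to survive re-parsing by a shell, which is the shape SSH’s wire protocol forces on you:

    #include <stdlib.h>
    #include <unistd.h>

    /* String form: a shell re-parses it, so the unquoted space
     * splits the filename into two arguments ("my" and "file.txt"). */
    static void run_as_string(void) {
        system("ls -l my file.txt");
    }

    /* Array form: "my file.txt" stays a single argument, with no
     * quoting games. (execvp replaces the process, so call it last.) */
    static void run_as_vector(void) {
        char *const args[] = { "ls", "-l", "my file.txt", NULL };
        execvp("ls", args);
    }

    int main(void) {
        run_as_string();
        run_as_vector();
        return 1; /* only reached if execvp failed */
    }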

                                                                                              1. 8

                                                                                                This was a long time ago and I wasn’t doing much system administration then on account of being way too young to hold a job (well, I’m not doing much of that now either but back then I was doing even less of it…) so I might be misremembering it. Please take this with a grain of salt – perhaps someone who did more Unix back then remembers this better? By the time I started doing anything with Unix, rsh was pretty much a relic.

I think this is not specifically SSH’s design; I think it was deliberately made to be identical to rsh’s. That was actually a very clever design choice – ssh was specifically meant to replace rsh, which was 10+ years old at the time, so coming up with a tool that could run 10 years’ worth of scripts was important; otherwise no one would’ve wanted it. It was a very efficient solution to a non-technical problem.

I suppose it would be possible to add a second syntax today, similar to Docker’s. I don’t know if that would be a good idea. IMHO it wouldn’t be, but I think I’m prooobably not in the target demographic for these things ¯\_(ツ)_/¯

                                                                                                Edit: oh yeah, if anyone’s wondering what arcane incantation from the ancient lore of the Unix greybeards is required to do that, just do:

                                                                                                echo "cd /tmp ; pwd" | ssh user@machine.local
                                                                                                

                                                                                                There are probably other/better ways, this one’s just my favourite.

                                                                                                FWIW, this one comes naturally to me, it’s the Docker way that “feels” weird to me. I don’t think one’s objectively better than the other, it’s just that I’d been doing it this way (with many other tools, not just SSH) for many years by the time Docker showed up. (Then again, it makes a lot of sense in Docker’s context, and Docker didn’t have to be a drop-in replacement for anything other than “works on my machine”, so I think they did the right thing)

                                                                                                1. 2

                                                                                                  I appreciate the point, but one minor warning: unlike ssh user@machine command, your echo command | ssh user@machine does allocate a pty for ssh (unless you use -T), which may affect the behaviour of some commands.

                                                                                                  1. 3

Oh, yeah, this is a useful thing to remember! I think newer versions are (at least on some systems?) smart enough to figure out not to allocate a pty if stdin isn’t an actual terminal, but I think that’s a whole other minefield. The reason I usually don’t add -T explicitly is weird muscle/brain memory trained on a couple of somewhat restricted use cases.

                                                                                                    I can’t edit my post anymore so here’s hoping nobody reads this thread halfway in the future :-D.