Threads for Forty-Bot

  1. 8

    I’m only on github to make PRs. An important step for me was to consciously avoid the dopamine triggers by removing all my stars and follows, and making my profile page as boring and inconsequential as possible. I find it’s easier to ignore the social scoring by committing to not reciprocate.

    1. 4

      An important step for me was to consciously avoid the dopamine triggers by removing all my stars and follows

      I’m genuinely curious, what’s your reason for doing that? To me, those things are the most direct indicators possible that people give a shit about what you’re doing and about you personally. That’s kind of what it’s all about for me; having people use and care about things I create is my primary motivation for programming (outside of work, which I do primarily for money).

      1. 15

        To me, those things are the most direct indicators possible that people give a shit about what you’re doing and about you personally.

        Clicking a button doesn’t exactly signal “giving a shit” to me… it requires no effort. What signals (to me) that people give a shit is seeing patches come in, bug reports, random folks on IRC or whatever saying “hi” and thanking me, seeing distros picking up my stuff, and so on. Giving fake internet points doesn’t necessarily mean that anyone gives a shit; at best they’re probably bored or mildly interested enough to click a button and move on to the next shiny thing in their feed.

        1. 4

          Every pull request or email patch I’ve received is a thousand times more meaningful than any star or follow on Github. Those are just pointless.

          Originally stars were for bookmarking, but it’s degenerated into meaningless 👍🏻/+1 noise.

          1. 3

            Exactly - a “star” can simply mean “hey this looks cool”. I’m sure a majority of people who star a project never even tried to use the project. It’s just ego inflation. More important is that people actually use your stuff, in places where it matters. If some project is technically cool but unusable, it could still acquire many many stars.

            1. 3

              I usually star projects so I can find them later.

          2. 7

            I’m genuinely curious, what’s your reason for doing that? To me, those things are the most direct indicators possible that people give a shit about what you’re doing and about you personally

            My bar for that rests at the point that someone gives me their personal feedback on my work in a way that lets me know they have actually read, studied, or used it. That is giving a shit. Competing with the whole world and collecting a few imaginary stars, stickers, or points does not say anything about your work, unless you happen to be a marketeer.

          3. 1

            I set mine to private; anyone can do the same. Eventually all my code will be self-hosted and accessible via an RSS feed and https.

          1. 14
            • 3D printing a flipflop
            1. 4

              This is a great goal! But just one?

              1. 2

                Could just be a portrait of the moment a politician flip-flopped? :)

                1. 2

                  What else do you do when you find one?

                  1. 2

                    Just one, and if successful, the other…

                    1. 2

                      I believe in you! Please post photos of the maiden flight.

                  2. 2

                    JK, SR, or D?

                    1. 6

                      Impressive changelog entry: https://github.com/citusdata/citus/blob/master/CHANGELOG.md#citus-v1102-june-15-2022

                      What I was awaiting most:

                      • non-blocking writes during shard rebalancing
                      • tenant isolation
                      • row-level policies
                      • triggers
                      • view (and others) propagation
                      1. 3

                        I’m waiting for proper resource limits. At the moment anything involving columnar tables ignores the memory limits, so it’s trivial to crash the server if your table has the right sort of data.

                        1. 2

                          Indeed, it seems Citus is approaching Vertica territory, but in open source. For what it is worth, Vertica also does not support triggers (unless things changed more recently)

                          1. 3

                            Triggers are now officially supported in Citus (you could already do it by (ab)using https://docs.citusdata.com/en/v11.0/develop/reference_propagation.html).

                        1. 10

                          (author here, any questions or comments and I’ll reply)

                          1. 2

                            meta question: how did you generate the diagrams? Those are neat!

                            1. 4

                              Short answer: a <canvas> tag and requestAnimationFrame().

                              This MDN page describes the technique with a good example, it’s what I learned from to create my page. Probably the simplest animation on my page is here but it isn’t going to be nearly as easy to follow.

                            2. 1

                              The paper you linked describes why the specific polynomial was chosen, but not why P=9 was chosen. Why was P=9 chosen?

                              Also, it seems like Curve25519 takes its name because the modulus is 2^255-19. However, you use the name Curve61 because your modulus is 61. Shouldn’t you name it Curve63 because the modulus is 2^6-3? Although Curve448 wouldn’t fit into this naming scheme at all…

                              1. 1

                                I suspect x=9 is the first x-value with a prime and sufficiently large order. On http://safecurves.cr.yp.to/rigid.html djb hints the same:

                                The usual choice is the generator with smallest possible x-coordinate for short Weierstrass curves or Montgomery curves, or smallest possible y-coordinate for Edwards curves.

                                As for your second point, as the creator of the (flawed, insecure) Curve61 I reserve the right to name it via whatever stupid scheme I want, hah. It wouldn’t have served the document to go into a naming digression when I just needed a name for the toy curve.
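
                                To make that selection rule concrete, here’s a brute-force sketch over a tiny short-Weierstrass curve (toy coefficients I made up, not the actual Curve61 parameters): it looks for the point with the smallest x-coordinate that generates the whole group, a simplified stand-in for the “smallest x with prime, sufficiently large order” rule.

                                p = 61
                                A, B = 9, 1  # hypothetical toy coefficients for y^2 = x^3 + A*x + B over GF(p)

                                points = [(x, y) for x in range(p) for y in range(p)
                                          if (y * y - (x ** 3 + A * x + B)) % p == 0]

                                def add(P, Q):
                                    # Textbook affine addition; None plays the role of the point at infinity.
                                    if P is None: return Q
                                    if Q is None: return P
                                    (x1, y1), (x2, y2) = P, Q
                                    if x1 == x2 and (y1 + y2) % p == 0:
                                        return None
                                    if P == Q:
                                        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, p) % p
                                    else:
                                        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
                                    x3 = (lam * lam - x1 - x2) % p
                                    return (x3, (lam * (x1 - x3) - y1) % p)

                                def order(P):
                                    n, Q = 1, P
                                    while Q is not None:
                                        n, Q = n + 1, add(Q, P)
                                    return n

                                group_order = len(points) + 1  # affine points plus the point at infinity
                                generator = min((P for P in points if order(P) == group_order), default=None)
                                print(group_order, generator)  # smallest-x generator, if the group is cyclic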

                                1. 2

                                  I’ll admit I spent an embarrassingly long time looking for where 25519 showed up as the modulus.

                            1. 8

                              No plan survives contact with the enemy

                              1. 14

                                — the guy who successfully planned out the Franco-Prussian War thirteen years in advance

                                1. 6

                                  It’s much easier to get programmers to throw away documentation than it is to get them to throw away code. And writing at least some documentation up-front, explaining how one would be expected to interact with the eventual code, often does a great job of exposing potential problems.

                                  Or as the Zen of Python succinctly puts it:

                                  If the implementation is hard to explain, it’s a bad idea.

                                  If the implementation is easy to explain, it may be a good idea.

                                1. 6

                                  Reminds me a bit of the talk “Discovering Python” by David Beazley, only in his case he had to implement half of the coreutils from scratch in python instead of downloading them because the computer was completely locked down.

                                  https://www.youtube.com/watch?v=RZ4Sn-Y7AP8

                                  1. 3

                                    That was a really interesting talk. Thanks for linking to it.

                                  1. 10

                                    Joe needed to transfer a number of files between two computers, but didn’t have an FTP server. “But I did have distributed Erlang running on both machines” (quote from the book, not the blogpost), and he then proceeded to implement a file transfer program in, what? 5-10 lines of Erlang, depending on how you count? Beautiful.

                                    1. 6

                                      It’s certainly impressive. As someone used to corporate environments 15 years after Joe wrote this, I’m even more amazed at the network access he enjoyed.

                                      1. 2

                                        Ye-ess, the network access and ability to start/access a server program is key, isn’t it?

                                        If it were SSH instead of Erlang, one could write

                                        ssh joe@server sh -c "base64 < myfile" | base64 --decode > myfile.copy
                                        

                                        to much the same effect — this is not downplaying Erlang at all, it’s a second illustration of how powerful a remote procedure call system can be. (And SSH can only return text, imagine having Erlang where your RPC can return any data structure a local call can. (In part because Erlang deliberately limits its data structures to ones that can be passed as messages.))

                                        1. 3

                                          If it were SSH instead of Erlang, one could write

                                          The funny thing is that iirc SCP works like this (or has fallbacks to work like this)

                                          And SSH can only return text, imagine having Erlang where your RPC can return any data structure a local call can

                                          SSH can transmit arbitrary data. That’s how terminal control characters still work.

                                          1. 2

                                            Heh, I’ve used ssh cat a few times when scp and sftp weren’t available or didn’t function.

                                    1. 2

                                      related: if you are copying between two fds you can use copy_file_range
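
                                      For example (a minimal sketch; os.copy_file_range is Linux-only and needs Python 3.8+), the copy loop stays entirely in the kernel, with no userspace buffer involved:

                                      import os

                                      def kernel_copy(src_path, dst_path, chunk=1 << 20):
                                          with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
                                              while True:
                                                  # copies between the two fds inside the kernel
                                                  n = os.copy_file_range(src.fileno(), dst.fileno(), chunk)
                                                  if n == 0:  # EOF
                                                      break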

                                      1. 7

                                        This library is nice, but I really wish there were better tools for guiding it. Generating examples can easily balloon into the most time-consuming part of testing. Often one has to resort to hard caps (min or max, reject(), max_examples, etc.) just to get the sample space down to something reasonable. IMO this weakens the library’s usefulness, since you are no longer testing as many edge cases. What I’d really like is either coverage-based fuzzing, or some other method to say “spend most of your entropy on these parameters.”

                                        Example test illustrating the above problem
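
                                        In the same spirit, a made-up sketch of the capping workarounds (not the linked example): the min/max caps and max_examples keep generation time sane, while reject() throws away already-generated inputs instead of steering generation toward interesting ones.

                                        from hypothesis import given, reject, settings, strategies as st

                                        @settings(max_examples=200)  # hard cap on how many inputs get generated
                                        @given(xs=st.lists(st.integers(min_value=-10**6, max_value=10**6), max_size=50))
                                        def test_sort_is_idempotent(xs):
                                            if len(set(xs)) != len(xs):
                                                reject()  # discard duplicates after the fact -- wasted entropy
                                            assert sorted(sorted(xs)) == sorted(xs)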

                                        1. 2

                                          Agreed! (both on the “nice library” assertion, and on effort curating input shapes)

                                          By pure coincidence, just yesterday, I wrapped up one increment of work to help make hypothesis integrate with fuzzers and symbolic execution tools better. (issue) There is still plenty to do, though.

                                          My labor-of-love project is a symbolic execution tool that gained (inefficient) hypothesis support late last year. The example in my writeup speaks to some of your concerns, perhaps.

                                        1. 1

                                          the nice thing about *nix is that you don’t have to restart the machine unless the misbehaving process is the kernel or init (there is kexec etc. but that doesn’t have the well-testedness advantage)

                                          1. 8

                                            The whole point of this article is that this statement is untrue.

                                            It is not uncommon for machines, regardless of OS[1], to get into a state where they’re globally pantsed - obviously some are worse at this than others (or is that better at it?). Oftentimes the result is just terrible performance, sometimes complete inability to make forward progress. It is possible there is a single faulty process, and an OS that has robust process management can deal with that. However, oftentimes you can’t isolate the fault to a single process, so you start having to restart increasingly large amounts of user space. At some point, restarting the system is the most sensible path forward, as it guarantee-ably gets your entire system into a non-pantsed state.

                                            A lot of system reliability engineering is the misery of debugging stuck systems to work out how they got into that stuck state, while also being continuously pinged by people wanting things to be running again.

                                            [1] I’ve worked with, and encountered “just reboot it” level problems with, a variety of Linuxes over the years (1.x, 2.x, 3.x - I don’t think I’ve used 4+ in any work situation), Mac OS 7.* (and people complain about Windows), all of the OS X/macOSes at varying levels of stability, Windows (weirdly, one of the most stable machines I ever had was this Compaq Windows Me thing), VAX/VMS, FreeBSD, and I’m sure at least a couple of others in a general mucking-around-during-uni setting

                                            1. 5

                                              Ooooh, did I ever tell you that thing about the uptime log :-D?

                                              So my first serious computer gig, back in 2002, eventually had me also helping the sysadmin who ran things at $part_time_job, which he graciously agreed to when I told him I wanted to learn a thing or two about Unix. One of the things we ran was a server – aptly called scrapheap – which ran all the things that people liked, but which management could not be convinced to pay for, or in any case, could not be convinced to pay enough. It ran a public-ish (invite-only) mailing list with a few hundred subscribers, a local mirror for distributed updates and packages, and a bunch of other things, all on a beige box that had been cobbled together out of whatever hardware was lying around.

                                              Since a lot of people spread around four or five offices in two cities ended up depending on it being around, it had excellent uptime (in fact I think it was rebooted only five or six times between 1999-ish when it was assembled and 2005 – a few times for hardware failure/upgrades, once to migrate it to Linux, and once because of some Debian shenanigans).

                                              On the desk under which it was parked lay a clipboard with what we affectionately called “the uptime log”. The uptime log listed all the things that had been done in order to keep the thing running without rebooting it, because past experience had taught us you never know how one of these is going to impact the system on the next boot, and nobody remembers what they did six months ago. Since the system was cobbled together from various parts this was particularly important, because hardware failure was always a possibility, too.

                                              The uptime log was not pretty. It included things like:

                                              • Periodic restart of samba daemon done 04.11.2002 21:30 (whatever), next one scheduled 09.11.2002 because $colleague has a deadline on 11.11 and they really need it running. I don’t recall why but we had to restart it pretty much weekly for a while, otherwise it bogged down. Restarts were scheduled not necessarily so as not to bother people (they didn’t take long and they were easy to do early in the morning) but mostly so as to ensure that they were performed shortly before anyone really needed it, when it was the fastest.
                                              • Changed majordomo resend params (looked innocuous, turned out to be relevant: restarting sendmail after an update presumably carried over some state, and it worked prior to the reboot, but not afterwards. That’s how I discovered Pepsi and instant coffee are a bad mix).
                                              • Updated system $somepackage (I don’t remember what it was, some email magic daemon thing). Separately installed old version under /opt/whatever. Amended init script to run both correctly but I haven’t tested it, doh.

                                              It was a really nasty thing. We called scrapheap various endearing names like “stubborn”, “quirky” or “prickly” but truth is everyone kinda dreaded the thing. I was the only one who liked it, mainly because – being so entangled, and me being a complete noob at these things – I rarely touched it.

                                              You could say well, Unix and servers were never meant to be used like that, you should’ve been running a real machine first of all and second of all it was obviously really messy and you could’ve easily solved all that by partitioning services better and whatnot. Thing is we weren’t exactly sh%&ing money, the choice was between this and running mailing lists by aviary carriers so I bet anyone who has to do similar things today, on a budget that’s not exactly a YC-funded startup or investment bank budget, is really super glad for Docker or jails or whatever they’re using.

                                              1. 2

                                                It is not uncommon for machines, regardless of OS[1], to get into a state where they’re globally pantsed - obviously some are worse at this than others (or is that better at it?)

                                                You’re absolutely right, though it’s also the case that Unixes have a lot more scopes that you can restart from initial conditions than other common OSes. Graphical program doesn’t work, and restarting it doesn’t help? Log out and log back in. That doesn’t fix it? Go to console and restart your window system. That doesn’t fix it? Go to single-user mode, then bring it back up to multi-user. Once that’s exhausted is when you need to reboot…

                                                Of course, just rebooting would be faster than trying all of these one after another. Usually.

                                                1. 1

                                                  The whole point of this article is that this statement is untrue.

                                                  I can tell you that I almost never restart my whole computer. Certainly, the software I write has been much better tested against restarting just the service; the “whole computer restart” path has not been. An easy example is that service ordering may not be properly specified, which is OK when everything is already running, but not OK when booting up.

                                                  Unix ain’t Windows. If you aren’t working on the kernel or PID 1 you almost never have to restart.

                                                  1. 3

                                                    Back in the early 2000s I had win2k/early XP and Linux systems. All of them went similarly long periods between reboots, measured in weeks. But even then, manually rebooting any of those was not an uncommon event.

                                                    Now these days, of course, most systems - including *nixes - have security updates requiring reboots at that kind of cadence, which presumably mitigates any potential “reboot fixed it” issues.

                                                    1. 2

                                                      You’d think so, but I have managed to bring GPUs into a broken state, where anything trying to communicate with them just hangs. Restart was the only way out.

                                                  2. 1

                                                    About 70% of the time that I upgrade libc at least one thing is totally hosed until reboot. And that thing might be my desktop environment, in which case “restarting that process” is exactly the same level of interruption as rebooting, just less thorough.

                                                    1. 1

                                                      Are you, by any chance, from Ontario? [I ask because I’ve only ever heard Ontarians use the term “hosed” that way & very much want to know if you are an exception to this pattern.]

                                                      1. 1

                                                        Nope! I’m from Virginia and have lived in Massachusetts for the past 10-ish years. It’s a term I hear people use from time to time, but I haven’t happened to notice any pattern to who. It’s likely that it spread in some subcultures or even in just some social subgraphs.

                                                        1. 1

                                                          Good to know! Thanks :)

                                                  1. 16

                                                    Stop using laptops. For the same money you can get a kickassssss workstation.

                                                    1. 27

                                                      But then for the time you want to work away from the desk you need an extra laptop. Not everyone needs that of course, but if you want to work remotely away from home or if you do on-call, then laptop’s a requirement.

                                                      1. 6

                                                        Laptops also have a built-in UPS! My iMac runs a few servers on the LAN and they all go down when there’s a blackout.

                                                        1. 2

                                                          Curious in which country you live that this is a significant enough problem to design for?

                                                          1. 5

                                                            Can’t speak for the other poster, but I think power distribution in the US would qualify as risky. And not only in rural areas: consider that even the Chicago burbs don’t have buried power lines. And every summer there’s the blackout due to AC surges. I’d naively expect at least 4 or 5 (brief) blackouts per year.

                                                      2. 8

                                                        i get that, but it’s also not a very productive framework for discussion. i like my laptop because i work remotely – 16GB is personally enough for me to do anything i want from my living room, local coffee shop, on the road, etc. i do junior full-stack work, so that’s likely why i can get away with it. obviously, DS types and other power hungry development environments are better off with a workhorse workstation. it’s my goal to settle down somewhere and build one eventually, but it’s just not on the cards right now; i’m moving around quite a bit!

                                                        my solution? my work laptop is a work laptop – that’s it. my personal laptop is my personal laptop – that’s it. my raspberry pi is for one-off experiments and self-hosted stuff – that’s it. in the past, i’ve used a single laptop for everything, and frequently found it working way too hard. i even tried out mighty for a while to see if that helped ((hint: only a little)). separation of concerns fixed it for me! obviously, this only works if your company supplies a laptop, but i would go as far as to say that even if they don’t it’s a good alternative solution, and might end up cheaper.

                                                        my personal laptop is a thinkpad i found whilst trash-hopping in the bins of the mathematics building at my uni. my raspberry pi was a christmas gift, and my work laptop was supplied to me. i spend most of my money on software, not really on the hardware.

                                                        edit: it’s also hard, since i have to keep things synced up. tmux and chezmoi are the only reasonable way i’ve been able to manage!

                                                        1. 6

                                                          Agree. The ergonomics of laptops are seriously terrible.

                                                          1. 7

                                                            Unfortunately I don’t think this is well known to most programmers. Recently a fairly visible blogger posted his workstation setup and the screen was positioned such that he would have to look downward just like with a laptop. It baffled many that someone who is clearly a skilled programmer could be so uninformed on proper working ergonomics and the disastrous effects it can have on one’s posture and long-term health.

                                                            Anyone who regularly sits at a desk for an extended period of time should be using an eye-level monitor. The logical consequence of that is that laptop screens should only be used sparingly or in exceptional circumstances. In that case, it’s not really necessary to have a laptop as your daily driver.

                                                            1. 6

                                                              After many years of using computers I don’t see much harm in using a slightly tilted display. If anything, regular breaks and stretches/exercises make a lot more difference, especially in the long term.

                                                              If you check out jcs’ setup more carefully you’ll see that the top line is not that much lower than the “default” eye-line, so the ergonomics there work just fine.

                                                              1. 1

                                                                We discuss how to improve laptop ergonomics and more at https://reddit.com/r/ergomobilecomputers .

                                                                (I switched to a tablet PC; the screen is also tilted a bit but raised closer to eye level. Perhaps the ‘fairly visible blogger’s’ setup was staged for the photo and might be raised higher normally.)

                                                            2. 2

                                                              That assumes you’re using the laptop’s built-in keyboard and screen all day long. I have my laptop hooked up to a big external monitor and an ergonomic keyboard. The laptop screen acts as a second monitor and I do all my work on the big monitor which is at a comfortable eye level.

                                                              On most days it has the exact same ergonomics as a desktop machine. But then when I occasionally want to carry my work environment somewhere else, I just unplug the laptop and I’m good to go. That ability, plus the fact that the laptop is completely silent unless I’m doing something highly CPU-intensive, is well worth the loss of raw horsepower to me.

                                                            3. 2

                                                              A kickass workstation which can’t be taken into the hammock, yes.

                                                              1. 1

                                                                I bought a ThinkStation P330 2.5y ago and it is still my best computing purchase. Once my X220 dies, if ever, then I will go for a second ThinkStation.

                                                                1. 3

                                                                  A few years ago I bought a used ThinkCentre m92. Ultra small form factor. Replaced the hard drive with a cheap SSD and threw in extra RAM and a 4k screen. Great setup. I could work very comfortably and do anything I want to do on a desktop, including development or watching 4k videos. I used that setup for five years and have recently changed to a 2-year-old iMac with an Intel processor so I can smoothly run Linux on it.

                                                                  There is no way I am suffering through laptop usage. I see laptops as something suited for sales people, car repair, construction workers and that sort of thing. For a person sitting a whole day in front of the screen… No way.

                                                                  I don’t get the need for people to be able to use their computers in a zillion places. Why? What’s so critical about it? How many people actually carried their own portable office vs. just doing their work at their desks before the advent of the personal computer? We already carry a small computer in our pocket at all times that fills up lots of personal work needs such as email, chat, checking webpages, conference calls, etc. Is it really that critical to have a laptop?

                                                                  1. 4

                                                                    I don’t get the need for people to be able to use their computers in a zillion places. Why? What’s so critical about it?

                                                                    I work at/in:

                                                                    1. The office
                                                                    2. Home office
                                                                    3. Living room

                                                                    The first two are absolutely essential, the third is because if I want to do some hobbyist computing, it’s not nice if I disappear in the home office. Plus my wife and I sometimes both work at home.

                                                                    Having three different workstations would be annoying. Not everything is on Dropbox, so I’d have to pass files between machines. I like fast machines, so I’d be upgrading three workstations frequently.

                                                                    Instead, I just use a single MacBook with an M1 Pro. Performance-wise it’s somewhere between a Ryzen 5900X and 5950X. For some things I care about for work (matrix multiplication), it’s even much faster. We have a Thunderbolt Dock, 4k screen, keyboard and trackpad at each of these desks, so I plug in a single Thunderbolt cable and have my full working environment there. When I need to do heavy GPU training, I SSH into a work machine, but at least I don’t have a terribly noisy NVIDIA card next to me on or under the desk.

                                                                    1. 3

                                                                      The first two are absolutely essential, the third is because if I want to do some hobbyist computing, it’s not nice if I disappear in the home office.

                                                                      I believe this is the crux of it. It boils down to personal preference. There is no way I am suffering through the horrible experience of using a laptop because it is not nice to disappear to the office. If anything, it raises the barrier to being in front of a screen.

                                                                    2. 2

                                                                      Your last paragraph is exactly my thoughts. Having a workstation is a great way to reduce lazy habits, IMNSHO. The mobility that comes with a laptop is ultimately a recipe for neck pain, strain in arms and hands, and poor posture and habits.

                                                                      1. 6

                                                                        I have 3 places in which I use my computer (a laptop). In two of them, I connect it to an external monitor, mouse and keyboard, and I do my best to optimize ergonomics.

                                                                        But the fact that I can take my computer with me and use it almost anywhere, is a huge bonus.

                                                                1. 1

                                                                  How did I figure this out? Why, through the power of friendship! No seriously, I just asked my friend and she told me to do this and it worked. I don’t know where else you’d find this information on your own.

                                                                  It’s also listed in the documentation.

                                                                  There’s a number of ways to automate this, simplest of which is probably baking a boot script into the u-boot image.

                                                                  The simplest way is something like

                                                                  => env set bootcmd 'my awesome boot command'
                                                                  => env save
                                                                  
                                                                  1. 16

                                                                            Fully agree. I also think it’s important to note that RAII is strictly more powerful than most other models, in that they can be implemented in terms of RAII. Some years ago I made this point by implementing defer in Rust: https://code.tvl.fyi/about/fun/defer_rs/README.md
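
                                                                            A rough Python analogue of the same layering (scoped cleanup as the primitive, with defer built on top of it; this illustrates the idea, not the linked Rust version):

                                                                            from contextlib import ExitStack

                                                                            with ExitStack() as stack:
                                                                                f = open("out.txt", "w")
                                                                                stack.callback(f.close)        # "defer f.close()"
                                                                                stack.callback(print, "done")  # "defer print(...)"; callbacks run LIFO
                                                                                f.write("hello")
                                                                            # scope exit: prints "done", then closes the file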

                                                                    1. 5

                                                                      What other models are you talking about? Off the top of my head, linear types are more powerful; with-macros (cf common lisp) are orthogonal; and unconstrained resource management strategies are also more powerful.

                                                                      1. 2

                                                                        unconstrained resource management strategies are also more powerful.

                                                                        How is that more “powerful” in OP’s sense? Can you implement RAII within the semantics of such a language?

                                                                        1. 5

                                                                          I should perhaps have said ‘expressive’. There are programs you can write using such semantics that you cannot write using raii.

                                                                      2. 2

                                                                        That’s interesting, but it wouldn’t work in Go because of garbage collection. You could have a magic finalizer helper, but you wouldn’t be able to guarantee it runs at the end of a scope. For a language with explicit lifetimes though, it’s a great idea.

                                                                        1. 11

                                                                          Lua (which has GC) has the concept of a “to-be-closed”. If you do:

                                                                          local blah <close> = ...
                                                                          

                                                                                  That variable will be closed when it goes out of scope, right then and there (no need to wait for GC): if the object has a __close metamethod, it will be called at that time.

                                                                          1. 7

                                                                            Sounds like the Python with statement.
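
                                                                                    A tiny sketch of the analogous shape (the context manager’s __exit__ runs deterministically when the block ends, without waiting for the GC):

                                                                                    class Handle:
                                                                                        def __enter__(self):
                                                                                            return self
                                                                                        def __exit__(self, exc_type, exc, tb):
                                                                                            print("released")  # cleanup runs here, at end of scope
                                                                                            return False       # don't swallow exceptions

                                                                                    with Handle() as blah:
                                                                                        pass  # use blah
                                                                                    # "released" has been printed by this point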

                                                                          2. 8

                                                                                  It doesn’t have to be such a clear-cut distinction. C# is a GC’d language but also has a using keyword for classes that implement IDisposable, which runs their Dispose method at the end of a lexical scope. This can be used to implement RAII and to manage the lifetimes of other resources.

                                                                            1. 2

                                                                              What do you do when you want the thing to live longer? For the Rust case, you just don’t let the variable drop. For Go, you can deliberately not defer a close/unlock. What do you do in C#?

                                                                              1. 3

                                                                                What do you do in C#?

                                                                                Hold onto a reference, don’t use using.

                                                                                1. 1

                                                                                  Ah. Seems a lot like with in Python.

                                                                                  1. 1

                                                                                          Basically, except I believe C# will eventually run Dispose if you don’t do it explicitly, unlike Python. I can’t find evidence of when C# introduced using, but IDisposable has been there since 1.0 in 2002, while Python introduced with in 2.5 in 2006.

                                                                                    1. 2

                                                                                      Python explicitly copied several features from C#. I wouldn’t be surprised if with was inspired by using.

                                                                              2. 1

                                                                                That’s a nice approach. Wish more languages would do something like this.

                                                                                1. 2

                                                                                  It still suffers from the point in the article where you don’t know who held a reference to your closeable thing and it’s not always super clear what is IDisposable in the tooling. I think VS makes you run the code analytics (whatever the full code scan is called) to see them.

                                                                                  1. 5

                                                                                    Has anyone written a language where stack / manual allocation is the default but GC’d allocations are there if you want them?

                                                                                    It seems mainstream programming jumped to GC-all-the-things back in the 90s with Java/C# in response to the endless problems commercial outfits had with C/C++, but they seem to have thrown the proverbial baby out with the bathwater in the process. RAII is fantastic & it wasn’t until Rust came along & nicked it from C++ that anyone else really sat up & took notice.

                                                                                    1. 2

                                                                                      Has anyone written a language where stack / manual allocation is the default but GC’d allocations are there if you want them?

                                                                                      D?

                                                                          1. 23

                                                                            RAII is far from perfect… here are a few complaints:

                                                                            1. Drop without error checking is also wrong by default. It may not be a big issue for closing files, but in the truly general case of external resources to-be-freed, you definitely need to handle errors. Consider if you wanted to use RAII for cloud resources, for which you need to use CRUD APIs to create/destroy. If you fail to destroy a resource, you need to remember that so that you can try again or alert.

                                                                              2. High-performance resource management utilizes arenas and other bulk acquisition and release patterns. When you bundle a destructor/dispose/drop/whatever into some structure, you have to bend over backwards to decouple it later if you wish to avoid the overhead of piecemeal freeing as things go out of scope.

                                                                            3. The Unix file API is a terrible example. Entire categories of usage of that API amount to open/write/close and could trivially be replaced by a bulk-write API that is correct-by-default regardless of whether it uses RAII internally. But beyond the common, trivial bulk case, most consumers of the file API actually care about file paths, not file descriptors. Using a descriptor is essentially an optimization to avoid path resolution. Considering the overhead of kernel calls, file system access, etc, this optimization is rarely valuable & can be minimized with a simple TTL cache on the kernel side. Unlike a descriptor-based API, a path-based API doesn’t need a close operation at all – save for some cases where files are being abused as locks or other abstractions that would be better served by their own interfaces.

                                                                            4. It encourages bad design in which too many types get tangled up with resource management. To be fair, this is still a cultural problem in Go. I see a lot of func NewFoo() (*Foo, error) which then of course is followed by an error check and potentially a .Close() call. Much more often than not, Foo has no need to manage its resources and could instead have those passed in: foo := &Foo{SomeService: svc} and now you never need to init or cleanup the Foo, nor check any initialization errors. I’ve worked on several services where I have systematically made this change and the result was a substantial reduction in code, ultimately centralizing all resource acquisition and release into essentially one main place where it’s pretty obvious whether or not cleanup is happening.
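
                                                                              A minimal Python rendering of point 4 (the service helper here is hypothetical): the resource is acquired and released in one central place and merely passed into Foo, so Foo itself needs no init error check and no Close():

                                                                              from contextlib import contextmanager

                                                                              class Foo:
                                                                                  def __init__(self, some_service):
                                                                                      self.some_service = some_service  # dependency passed in; nothing to clean up

                                                                              @contextmanager
                                                                              def service():  # hypothetical composition-root helper that owns the resource
                                                                                  svc = open("/dev/null", "wb")  # stand-in for a real connection
                                                                                  try:
                                                                                      yield svc
                                                                                  finally:
                                                                                      svc.close()  # the one place cleanup happens

                                                                              with service() as svc:
                                                                                  foo = Foo(svc)  # plain construction: no error to check, nothing to Close()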

                                                                            1. 3

                                                                              This is super informative, thanks! Probably worth it to turn this comment into a post of its own.

                                                                              1. 3

                                                                                The error handling question for RAII is a very good point! This is honestly where I’m pretty glad with Python’s exception story (is there a bad error? Just blow up! And there’s a good-enough error handling story that you can wrap up your top level to alert nicely). As the code writer, you have no excuse to, at least, just throw an exception if there’s an issue that the user really needs to handle.

                                                                                  I’ll quibble with 2 though. I don’t think RAII and arenas conflict too much? So many libraries are actually handles to managed memory elsewhere, so you don’t have to release memory the instant you destruct your object if you don’t want to. Classically, reference-counted references could just decrement a number by 1! I think there’s a lot of case-by-case analysis here, but I feel like common patterns don’t conflict with RAII that much?

                                                                                EDIT: sorry, I guess your point was more about decoupling entirely. I know there are libs that parametrize by allocator, but maybe you meant something even a bit more general

                                                                                1. 2

                                                                                  It may not be a big issue for closing files

                                                                                  from open(2):

                                                                                  A careful programmer will check the return value of close(), since it is quite possible that errors on a previous write(2) operation are reported only on the final close() that releases the open file description. Failing to check the return value when closing a file may lead to silent loss of data. This can especially be observed with NFS and with disk quota.

                                                                                  so it’s actually quite important to handle errors from close!

                                                                                  The reason for this is that (AFAIK) write is just a request to do an actual write at some later time. This lets Linux coalesce writes/reschedule them without making your code block. As I understand it this is important for performance (and OSs have a long history of lying to applications about when data is written). A path-based API without close would make it difficult to add these kinds of optimizations.
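
                                                                                    The careful pattern the man page is asking for looks something like this (a sketch): force write-back with fsync so deferred I/O errors surface while you can still handle them, and don’t swallow what close() raises either.

                                                                                    import os

                                                                                    fd = os.open("data.txt", os.O_WRONLY | os.O_CREAT, 0o644)
                                                                                    try:
                                                                                        os.write(fd, b"important bytes")  # may only be queued, not on disk yet
                                                                                        os.fsync(fd)                      # force write-back; I/O errors surface here
                                                                                    finally:
                                                                                        os.close(fd)  # can still raise OSError (e.g. on NFS); don't ignore it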

                                                                                  1. 1

                                                                                    My comment about close not being a big issue is with respect to disposing of the file descriptor resource. The actual contents of the file is another matter entirely.

                                                                                    Failing to check the return value when closing a file may lead to silent loss of data.

                                                                                    Is that still true if you call fsync first?

                                                                                    A path-based API without close would make it difficult to add these kinds of optimizations.

                                                                                    Again, I think fsync is relevant. The errors returned from write calls (to either paths or file descriptors) are about things like whether or not you have access to a file that actually exists, not whether or not the transfer to disk was successful.

                                                                                    This is also related to a more general set of problems with distributed systems (which includes kernel vs userland) that can be addressed with something like Promise Pipelining.

                                                                                  2. 1
                                                                                    1. Note that in the context of comparison with defer, it doesn’t improve the defaults that much. The common pattern of defer file.close() doesn’t handle errors either. You’d need to manually set up a bit of shared mutable state to replace the outer return value in the defer callback before the outer function returns. OTOH you could throw from a destructor by default.

                                                                                    2. I disagree about “bend over backwards”, because destructors are called automatically, so you have no extra code to refactor. It’s even less work than finding and changing all relevant defers that were releasing resources piecemeal. When resources are owned by a pool, its use looks like your fourth point, and the pool can release them in bulk.

                                                                                        3/4 are general API/architecture concerns about resource management, which may be valid, but are not really specific to RAII vs defer; from the perspective of these issues, those are just an implementation detail.

                                                                                    1. 1

                                                                                      I disagree about “bend over backwards”, because destructors are called automatically, so you have no extra code to refactor

                                                                                          You’re assuming you control the library that provides the RAII-based resource. Forget refactoring, thinking only about the initial code being written: if you don’t control the resource-providing library, you need to do something unsavory in order to prevent destructors from running.

                                                                                      1. 1

                                                                                        Rust has a ManuallyDrop type wrapper if you need it. It prevents destructors from running on any type, without changing it.

                                                                                        Additionally, using types via references never runs destructors when the reference goes out of scope, so if you refactor T to be &T coming from a pool, it just works.

                                                                                      2. 1

                                                                                        The common pattern of defer doesn’t handle errors either.

                                                                                        In Zig you have to handle errors or explicitly discard them, in defer and everywhere else.

                                                                                      3. 1

                                                                                        Using a descriptor is essentially an optimization to avoid path resolution.

                                                                                          I believe it’s also the point at which permissions are validated.

                                                                                        Entire categories of usage of that API amount to open/write/close and could trivially be replaced by a bulk-write API

                                                                                        Not quite sure I understand what you mean… Something like Ruby IO.write()? https://ruby-doc.org/core-3.1.2/IO.html#method-c-write

                                                                                        1. 3

                                                                                          also the point at which permissions are validated

                                                                                            I think that’s right, though the same “essentially an optimization” comment applies, although the TTL cache solution is less applicable.

                                                                                          Something like Ruby IO.write()

                                                                                          Yeah, pretty much.

                                                                                      1. 1

                                                                                        your “correct” sound is very loud

                                                                                        please add a volume slider (which is both nice to adjust and serves as a warning that there is sound)

                                                                                        1. 16

                                                                                            This is an excellent resource! I worked on a feed reader from 2003-2007, and broken feeds were a constant annoyance. A lot of this seemed to be caused by generating the feed with the same template engine as the HTML, without taking into account that it’s supposed to be XML.

                                                                                          I hope the situation is better now, but the major mistakes I saw then were:

                                                                                          • Invalid XML, probably caused by sloppy code generating it. People get used to sloppy HTML because browsers are forgiving, but XML is stricter and doesn’t allow improper nesting or unquoted attributes.
                                                                                          • HTML content embedded unquoted in the XML. This can be done legally, but the content has to be valid XHTML, else it breaks the feed. If in doubt, wrap CDATA around your HTML.
                                                                                          • Incorrectly escaped text. It’s not hard to XML-escape text, but people managed to screw it up. Get it wrong one way and it breaks the XML; or you can double-escape and then users will see garbage like “&quot;” in titles and content.
                                                                                          • Bad text encoding. Not declaring the encoding and making us guess! Stating one encoding but using another! An especially “fun” prank was to use UTF-8 for most of the feed but have the content be something else like ISO-8859.
                                                                                          • Badly-formatted dates. This was a whole subcategory … using the wrong date format, or localized month names, or omitting the time zone, or other more creative mistakes.
                                                                                          • Not using entry UUIDs and then changing the article URLs. Caused lots of complaints like “the reader marked all the articles unread again!”
                                                                                          • Serving the feed as dynamic content without a Last-Modified or Etag header. Not technically a mistake, but hurts performance on both sides due to extra bandwidth and the time to generate and parse.

                                                                                          Fortunately you can detect nearly all these by running the feed through a validator. Do this any time you edit your generator code/template.
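
                                                                                            For the escaping and encoding bullets in particular, the sturdiest fix is to build entries with a real XML library rather than a text template; a small sketch with Python’s stdlib (hypothetical entry content):

                                                                                            from datetime import datetime, timezone
                                                                                            from xml.etree import ElementTree as ET

                                                                                            entry = ET.Element("entry")
                                                                                            ET.SubElement(entry, "title").text = 'She said "hi" & left <quickly>'  # escaped for us
                                                                                            ET.SubElement(entry, "id").text = "urn:uuid:123e4567-e89b-12d3-a456-426614174000"
                                                                                            ET.SubElement(entry, "updated").text = (
                                                                                                datetime.now(timezone.utc).isoformat(timespec="seconds")  # RFC 3339, with zone
                                                                                            )
                                                                                            print(ET.tostring(entry, encoding="unicode"))
                                                                                            # <entry><title>She said "hi" &amp; left &lt;quickly&gt;</title>...</entry>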

                                                                                          For anyone wanting to write a feed reader: you’ll definitely want something like libTidy, which can take “tag soup” and turn it into squeaky clean markup. Obviously important for the XML, but also for article HTML if you plan on embedding it inside a web page — otherwise errors like missing close tags can destroy the formatting of the enclosing page. LibTidy also improves security by stripping potentially dangerous stuff like scripts.

                                                                                          The one thing in this article I disagree with is the suggestion to use CSS to style article content. It’s bad aesthetically because your articles will often be shown next to articles from other feeds, and if every article has its own fonts and colors it looks like a mess. Also, I think most readers will just strip all CSS (we did) because there are terrible namespace problems when mixing unrelated style sheets on the same page.

                                                                                          PS: For anyone doing research on historical tech flame wars, out-of-control bikeshedding, and worst-case scenarios of open data format design — the “feed wars” of the early/mid Oughts are something to look at. Someone (Mark Pilgrim?) once identified no less than eleven different incompatible versions of RSS, some of which didn’t even have their own version numbers because D*ve W*ner used to like to make changes to the RSS 2.0 “spec” (and I use that term loosely) without bumping the version.

                                                                                          1. 5

                                                                                            Not using entry UUIDs and then changing the article URLs. Caused lots of complaints like “the reader marked all the articles unread again!”

                                                                                            I have unsubscribed from certain blogs because of this. It’s no fun when they keep “posting” the last 10 articles all the time…

                                                                                            1. 1

                                                                                              It drives me mad when I occasionally update my feeds and suddenly have tens, or hundreds (!) of “new” articles.

                                                                                              Doesn’t happen often enough that I’d want to delete the feed, but still very annoying.

                                                                                            2. 3

                                                                                              Someone (Mark Pilgrim?) once identified no less than eleven different incompatible versions of RSS,

                                                                                              I suspect this is because of intense commitment to the robustness principle (Postel’s Law). Tim Bray rebutted Dave Winer and Aaron Swartz’s frankly goofy devotion to this idea. I think it’s better to follow Bray’s advice.

                                                                                              1. 6

                                                                                              Actually it was Pilgrim and Aaron Swartz he was rebutting in that blog post, not Winer.

                                                                                                And the 11-different-versions madness had nothing to do with liberal parsers, but with custody battles, shoehorning very different formats under the same name (RDF vs non-RDF), Winer’s allergy to writing a clear detailed spec or at least versioning his changes to it, and various other people’s ego trips.

                                                                                                In my experience, writing a liberal parser was a necessity because large and important feed publishers were among those serving broken feeds, and when your client breaks on a feed, users blame you, said users including your employer’s marketing department. Web browsers have always been pretty liberal for this reason.

                                                                                                1. 1

                                                                                                Actually it was Pilgrim and Aaron Swartz he was rebutting in that blog post, not Winer.

                                                                                                  Oh, right. Typed the wrong name there. Not gonna go back and edit it, though.

                                                                                              2. 2

                                                                                              There’s one good alternative to UUIDs: tag URIs, which have the benefit of being human-readable (see the sketch below).

                                                                                                I remember the feed wars! Winer’s petulance caused so much damage. I haven’t used anything but Atom since then for anything I publish, and I advise people to give the various flavours of RSS a wide berth.
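
                                                                                              For example (a sketch; the domain and path are invented), an RFC 4151 tag URI bakes an authority and a date into the ID, so it stays stable even if the URL later changes:

                                                                                                  from datetime import date

                                                                                                  def tag_uri(authority, minted, specific):
                                                                                                      # RFC 4151: "tag:" authority "," date ":" specific
                                                                                                      return f"tag:{authority},{minted.isoformat()}:{specific}"

                                                                                                  print(tag_uri("example.com", date(2006, 3, 1), "/posts/feed-wars"))
                                                                                                  # tag:example.com,2006-03-01:/posts/feed-wars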

                                                                                              1. 15

                                                                                                2FA/MFA became so annoying that I now mean to get an Android emulator running just to generate these passcodes. I’m sick of having to grab my phone, unlock it, open some app or wait for a text, and rush to type the code in before it resets. Such a huge pain in the ass.

                                                                                                1. 10

                                                                                                  1Password includes TOTP for mimicking 2FA. Bitwarden does as well, but it’s a bit clunkier.

                                                                                                  1. 10

                                                                                                    It’s not really “mimicking” — it’s a TOTP app generating codes the same way as any other TOTP app.
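
                                                                                                    The whole algorithm (RFC 6238 on top of RFC 4226) fits in a few lines; here’s a stdlib-only Python sketch, with a made-up example secret:

                                                                                                        import base64, hmac, struct, time

                                                                                                        def totp(secret_b32, digits=6, period=30):
                                                                                                            key = base64.b32decode(secret_b32.upper() + "=" * (-len(secret_b32) % 8))
                                                                                                            counter = struct.pack(">Q", int(time.time()) // period)  # 30-second window
                                                                                                            mac = hmac.new(key, counter, "sha1").digest()
                                                                                                            offset = mac[-1] & 0x0F  # RFC 4226 dynamic truncation
                                                                                                            code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
                                                                                                            return str(code % 10 ** digits).zfill(digits)

                                                                                                        print(totp("JBSWY3DPEHPK3PXP"))  # matches what any compliant app shows right now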

                                                                                                    1. 13

                                                                                                      It’s mimicking that there is a second factor, which often implies a second, isolated piece of hardware

                                                                                                    2. 10

                                                                                                      KeePassXC, a non-subscription-based open source password manager, supports TOTP too.

                                                                                                    3. 6

                                                                                                      It took me a little while to get used to remembering my Yubikey, but it’s been pretty great for me. I have one that’s USB-C on one end and Apple Lightning on the other. Also, if I switched to a Chromium-based browser, I could use the Mac’s Touch ID for 2FA (Firefox on Mac doesn’t support it, though).

                                                                                                      Disclosure: GitHub employee, but not involved with this security effort.

                                                                                                      1. 4

                                                                                                        On Windows, you can use a TPM and on iOS / Android you can use their credential manager (which is secure on iOS and may or may not be secure on Android depending on how much of a cheapskate the handset manufacturer was). GitHub has done a fantastic job on making this usable. I haven’t used a password with GitHub for a few years for anything other than adding a new device.

                                                                                                        Disclosure: Microsoft employee, but not working directly with GitHub on anything, just a very happy user (of everything except their complete misunderstanding of the principles of least privilege and intentionality in their application of the Zero Trust buzzword).

                                                                                                      2. 4

                                                                                                        KeePassXC allows storing 2FA tokens

                                                                                                        1. 3

                                                                                                          you can use oathtool to generate them directly

                                                                                                          1. 2

                                                                                                            Buy a USB-A U2F key and leave it permanently plugged into the computer

                                                                                                            1. 1

                                                                                                              Store the TOTP secret somewhere (I have it in Bitwarden, which also lets me generate tokens directly through the official clients) and run it through oathtool to generate a single-use token. On my setup I can generate and paste a token via ydotool with a single keystroke.

                                                                                                              1. 1

                                                                                                                2FA on GitHub rarely shows up; I do have it enabled, and I pretty much never need to enter a second factor in daily usage. It’s the same as 2FA with Google, which is rarely needed day to day. It’s pretty much for sensitive operational changes to accounts (repos in this case, I guess), logging in from new devices, or logging in from a device that hasn’t been used in a while. Other platforms are a bit more annoying, for sure, but I feel GitHub gets the balance right in this regard. I’m actually surprised they’re waiting almost 1.5 years before enforcing it, though… that seems a bit too long IMO.

                                                                                                                1. 1

                                                                                                                  My OnlyKey covers FIDO2 and TOTP inputs with ease. It came with a keychain, so it stays right next to my home key and my motorbike key, making it hard to forget.

                                                                                                                  Password Store has additionally worked well for syncing between Linux and Android, and its OTP plugin covers that aspect as well.

                                                                                                                  1. 1

                                                                                                                    I have a template Perl script I use for TOTP. I copy it over, put in the new key, and run it from the shell to get a TOTP code. I try very hard not to let them use my phone for this.

                                                                                                                    1. 1

                                                                                                                      I try very hard not to let them use my phone for this.

                                                                                                                      Why though?

                                                                                                                      1. 1

                                                                                                                        If I lose my phone, I’m potentially screwed, depending on what recovery mechanisms there are. But I can back up a Perl script and store it securely.

                                                                                                                        1. 1

                                                                                                                          You can back up the QR code from the TOTP app too (it just encodes an otpauth:// URI; see the sketch below). Also, GitHub gives you backup codes to print out.
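
                                                                                                                          A sketch of what that URI looks like (the issuer, account, and secret here are all invented):

                                                                                                                              from urllib.parse import quote

                                                                                                                              def otpauth_uri(issuer, account, secret_b32):
                                                                                                                                  label = quote(f"{issuer}:{account}")
                                                                                                                                  return f"otpauth://totp/{label}?secret={secret_b32}&issuer={quote(issuer)}"

                                                                                                                              print(otpauth_uri("GitHub", "alice", "JBSWY3DPEHPK3PXP"))
                                                                                                                              # otpauth://totp/GitHub%3Aalice?secret=JBSWY3DPEHPK3PXP&issuer=GitHub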

                                                                                                                  1. 4

                                                                                                                      IMO this is a good candidate for a partitioned table. Partition by (e.g.) month, and you can delete an entire month’s worth of data just by dropping that partition (see the sketch below).
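
                                                                                                                      A sketch of what that looks like on PostgreSQL (the table and column names are invented; assumes the psycopg 3 driver):

                                                                                                                          import psycopg  # pip install psycopg

                                                                                                                          with psycopg.connect("dbname=app") as conn:
                                                                                                                              # Illustrative schema, range-partitioned on the timestamp column.
                                                                                                                              conn.execute("""
                                                                                                                                  CREATE TABLE events (
                                                                                                                                      id         bigint GENERATED ALWAYS AS IDENTITY,
                                                                                                                                      payload    text,
                                                                                                                                      created_at timestamptz NOT NULL
                                                                                                                                  ) PARTITION BY RANGE (created_at)
                                                                                                                              """)
                                                                                                                              conn.execute("""
                                                                                                                                  CREATE TABLE events_2024_01 PARTITION OF events
                                                                                                                                      FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
                                                                                                                              """)
                                                                                                                              # Retiring January is a near-instant metadata change, not a slow DELETE:
                                                                                                                              conn.execute("DROP TABLE events_2024_01")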

                                                                                                                    1. 2

                                                                                                                      The author mentions this at the bottom of the article.

                                                                                                                      1. 1

                                                                                                                        So they do.

                                                                                                                      2. 1

                                                                                                                          That may or may not be useful on SSDs/NVMe. One of the world’s MySQL experts (someone I know) had a rant about this: partitioned tables are not faster if you’re not on spinning disks. I don’t remember the details, but it was fairly compelling. This obviously depends on your storage engine. For something like Splunk, where archival/deletion of cold buckets was a design goal of the storage engine, it works fine, but Splunk is automatically partitioned by timestamp anyway.