Threads for cetera

  1. 24

    Heroku was at least a decade ahead of its time as far as developer experience goes.

    For sure. A testament to this fact is that my very first side project ever was deployed to Heroku, and 5+ years later it’s still rock-solid and my highest uptime app. It’s so hard to have abstractions that are leak-proof enough for novices yet still resilient and flexible.

    Using Heroku made me finally understand that sometimes paying for the right software can pay huge dividends in time savings. Running servers well is not trivial!

    1. 9

      I was recently cleaning up old accounts I no longer use, and my Heroku account was among them. I logged in, and saw I still had an app running. It was set up in 2012.

      Heroku was way, way ahead of its time.

    1. 4

      we still have some work to do before we’re ready to share it

      Hope they hurry up and remove the expletives from the commit log! Wonder if there will be enough stuff in the codebase to run it totally on-prem, like perhaps a replacement for https://www.openfaas.com/ and others.

      1. 16

        Yes, you’ll be able to run it on prem. This will be a daemon that operates sort of like Node or Deno.

        Sorry for the vaporware announcement. The code was written as an integrated part of the overall Cloudflare stack, and some work is needed to disentangle it into something that can run stand-alone. I would have preferred to finish that work before announcing, but the fact that we’ll be releasing the code is sort of important context for some other announcements this week, like the WinterCG announcement.

      1. 5

        Good writeup.

        I think there’s one more potential attack that’s missing: a vulnerability or misconfiguration in etcd (or wherever else you store your secrets) that allows reading without requiring root access or physical access to the machine. In that scenario, encrypted secrets can provide another layer of security assuming the decryption is done via a separate mechanism.

        I also don’t think “most people aren’t using feature X of Vault” is that strong an argument. You can’t dismiss a tool by insisting people aren’t using it correctly.

        1. 5

          Yeah, the dismissiveness of Vault is mainly just me ranting. Maybe it should have been an aside or footnote, because the argument doesn’t rely on people misusing Vault. A properly configured Vault instance is what I ultimately compared to plain Kubernetes Secrets.

          1. 3

            I agree that Vault is a complicated beast (I used to manage Vault at a previous employer), but the USP for Vault has to be dynamic secrets with TTLs, right? So even if you could read the secret from the RAM disk on the node/pod, it would not be usable unless you timed it so that you read it after the injector injected it but before the service read it.

            1. 2

              Are we talking about https://www.hashicorp.com/resources/painless-password-rotation-hashicorp-vault ?

              My understanding is that while Vault can perform automatic password rotation, it can’t e.g. configure Redis or MySQL to change the password automatically. You could build something that does that for every secret-consuming application, but now vault is relegated to being a random password generator and again could be replaced with plain kubernetes secrets, /dev/urandom, and a cronjob.
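
              For what it’s worth, that replacement really is about this small. A rough sketch you could hang off a cronjob (every name here is invented, and error handling is left out):

              # Read the current password, generate a new one, tell the database, then update the Secret.
              OLD_PW=$(kubectl get secret app-db-password -o jsonpath='{.data.password}' | base64 -d)
              NEW_PW=$(head -c 32 /dev/urandom | base64 | tr -d '/+=' | cut -c1-24)
              mysql -h db.internal -u admin -p"$OLD_PW" -e "ALTER USER 'app'@'%' IDENTIFIED BY '${NEW_PW}';"
              kubectl create secret generic app-db-password \
                --from-literal=password="$NEW_PW" --dry-run=client -o yaml | kubectl apply -f -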

              1. 1

                I think the real value from Vault is the policies, not just storage. If a deployment is not taking advantage of that, then yes it’s no better than etcd or anything else.

        1. 3

          I’d personally throw a local git repo or something in there if we’re just editing html files on some server. One bad file glob or “sed” command can delete the only copy of some critical documentation. Plus the git log itself serves as additional documentation.

          1. 1

            The editing on the server made me wince. Build steps can be annoying, but a separate step between development and production is ABSOLUTELY necessary. And NONE of the benefits would be lost.

          1. 3

            Interesting stuff! Maybe my overlay networking knowledge is below the intended audience, but I wish there were more exposition/links about the “well-documented drawbacks” and the “better reliability and scalability characteristics” mentioned.

            The Kubernetes networking space is sadly filled with myths and brand-loyalty since so few of its users actually understand it (myself included, at least about the overlay part since that abstraction has proven to be non-leaky for me). So more writing like this from people who actually understand the stuff is very welcome!

            1. 4

              The Kubernetes network model requires each pod (co-located process group) to have its own IP address. The IPv4 address space just isn’t large enough to accommodate that gracefully.

              Let’s say you use 10.0.0.0/8 for internal network addresses. Each datacenter might be allocated a /16, so you can have 256 datacenters with ~65,000 machines each. Sounds great! You’ve got a lot of scaling headroom.

              Now you want to introduce Kubernetes. Each pod gets an IP, so instead of having room for 65,000 big beefy rack-mounted machines, your address space has room for 65,000 tiny ephemeral pods. The operator is forced to choose between two bad options.

              Option A: Static assignment of IPv4 prefixes to each machine. If you allocate a /17 to Kubernetes pod IPs, and have 2000 machines, then you can run at most 16 pods per machine.

              • That’s an order-of-magnitude reduction in maximum per-cluster machine count (65,000 -> 2000).
              • Utilization of machine capacity is low because 16 single-core (cpu: 1) services can take all the pod IPs on a 100-core machine.
              • Utilization of IPv4 address space is low because a single big workload might take up a whole machine (stranding 15 addresses).

              Option B: An overlay network with dynamic assignment of pod IPs from a shared pool. This improves utilization because IPs can be assigned where they’re needed, but now you’ve got a bigger problem: knowing a pod IP is no longer enough information to route a packet to it!

              • Now you need some sort of proxy layer (iptables/nftables, userspace proxy, “service mesh”) running on each machine.
              • Every time a pod gets rescheduled, it gets a new IP, and every machine in the cluster needs to update its proxy configuration. The update rate scales with (cluster size * pod count * pod reschedule rate), which is too many multiplications.

              When you see the dozens of startups involved in Kubernetes networking, which all promise better throughput or lower latency or whatever, they’re fighting for the market in tooling created by Option B.
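
              (If you want to check the arithmetic behind Option A, it’s just integer division; shell shown for illustration.)

              # A /16 per datacenter is 65,536 addresses: one per machine.
              # Reserve a /17 (32,768 addresses) for pod IPs and spread it across 2,000 machines:
              echo $(( 32768 / 2000 ))   # => 16 pods per machine, at most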

            1. 3

              I am excited for a future where we can strip away the myriad NAT and proxy layers that have been built up to serve microservices (especially in k8s). Sounds like the path to that future is IPv6.

              I believe a single packet flowing through a k8s loadbalancer to an app on another node has to be copied something like 12-14 times (DNAT and SNAT on both sides, in and out of kernel space, etc.), which could be improved considerably with IPv6 replacing all of that NATing.

              1. 3

                See also https://www.conservationmagazine.org/2012/12/heirloom-technology/ (Saul Griffith, 2012) for a perspective from roughly a decade ago.

                1. 2

                  That’s a good read! Unfortunate to see that not a whole lot has changed on this subject after a decade; things are still “progress[ing] in a piecemeal manner” and “still largely conceptual in nature”.

                1. 121

                  I used to give the same advice, but I completely changed my opinion over the past 10 years or so. I eventually put in the time and learned shell scripting. These days my recommendation is:

                  1. Learn to use the shell. It’s a capable language that can take you very far.
                  2. Use ShellCheck to automatically take care of most of the issues outlined in the article.

                  I really don’t want to figure out every project’s nodejs/python/ruby/make/procfile abomination of a runner script anymore. Just like wielding regular expressions, knowing shell scripting is a fundamental skill that keeps paying dividends over my entire career.

                  1. 60

                    Bingo.

                    My advice is:

                    • Always use #!/usr/bin/env bash at the beginning of your scripts (change if you need something else, don’t rely on a particular path to bash though).
                    • Always add set -euo pipefail after that (a minimal skeleton following these points is sketched just after this list).
                    • Always run shellcheck.
                    • Always run shfmt.
                    • Always pay attention to what version of bash you need to support, and don’t go crazy with “new” features unless you can get teammates to upgrade (this is particularly annoying because Apple ships an older version of bash without things like associative arrays).
                    • Always use the local storage qualifier when declaring variables in a function.
                    • As much as possible, declare things in functions and then at the end of your script kick them all off.
                    • Don’t use bash for heavy-duty hierarchical data munging…at that point consider switching languages.
                    • Don’t assume that a bashism is more broadly acceptable. If you need to support vanilla sh, then do the work.
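
                    For reference, a minimal skeleton following most of the points above might look like this (illustrative only, not any kind of standard):

                    #!/usr/bin/env bash
                    set -euo pipefail

                    build() {
                      local target="${1:-all}"   # 'local' keeps variables scoped to the function
                      echo "building ${target}"
                    }

                    main() {
                      build "$@"                 # declare things in functions, kick them off at the end
                    }

                    main "$@"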

                    While some people like the author will cry and piss and moan about how hard bash is to write, it’s really not that bad if you take those steps (which to be fair I wish were more common knowledge).

                    To the point some folks here have already raised, I’d be okay giving up shell scripting. Unfortunately, in order to do so, a replacement would:

                    • Have to have relatively reasonable syntax
                    • Be easily available across all *nix-likes
                    • Be guaranteed to run without additional bullshit (installing deps, configuring stuff, phoning home)
                    • Be usable with only a single file
                    • Be optimized for the use case of bodging together other programs and system commands with conditional logic and first-class support for command-line arguments, file descriptors, signals, exit codes, and other *nixisms.
                    • Be free
                    • Not have long compile times

                    There are basically no programming languages that meet those criteria other than the existing shell languages.

                    Shell scripting is not the best tool for any given job, but across every job it’ll let you make progress.

                    (Also, it’s kinda rich having a Python developer tell us to abandon usage of a tool that has been steadily providing the same, albeit imperfect, level of service for decades. The 2 to 3 switch is still a garbage fire in some places, and Python is probably the best single justification for docker that exists.)

                    1. 26

                      While some people like the author will cry and piss and moan about how hard bash is to write, it’s really not that bad if you take those steps (which to be fair I wish were more common knowledge).

                      I think “nine steps” including “always use two third-party tools” and “don’t use any QoL features like associative arrays” does, in fact, make bash hard to write. Maybe Itamar isn’t just here to “cry and piss and moan”, but actually has experience with bash and still thinks it has problems?

                      1. 2

                        To use any language effectively there are some bits of tribal knowledge…babel/jest/webpack in JS, tokio or whatever in Rust, black and virtualenv in Python, credo and dialyzer in Elixir, and so on and so forth.

                        Bash has many well-known issues, but maybe clickbait articles by prolific self-promoters that don’t offer a path forward also have problems?

                        1. 15

                          If your problem with the article is that it’s clickbait by a self-promoter, say that in your post. Don’t use it as a “gotcha!” to me.

                          1. 2

                            I think there’s merit here in exploring the criticism, though there’s room for tone softening. Every language has some form of “required” tooling that’s communicated through community consensus. What makes Bash worse than other languages that also require lots of tools?

                            There are a number of factors at play here and I can see where @friendlysock’s frustration comes from. Languages exist on a spectrum between lots of tooling and little tooling. I think something like SML is on the “little tooling” end, where compilation alone is enough to add high assurance to the codebase. Languages like C are on the low-assurance part of this spectrum, where copious use of noisy compiler warnings, analyzers, and sanitizers is used to guide development. Most languages live somewhere on this spectrum. What makes Bash’s particular compromises deleterious or not deleterious?

                            Something to keep in mind is that (in my experience) the Lobsters userbase seems to strongly prefer low-tooling languages like Rust over high-tooling languages like Go, so that may be biasing the discussion and reactions thereof. I think it’s a good path to explore though because I suspect that enumerating the tradeoffs of high-tooling or low-tooling approaches can illuminate problem domains where one fits better than the other.

                            1. 2

                              I felt that I sufficiently commented about the article’s thesis on its own merits, and that bringing up the author’s posting history was inside baseball and not terribly relevant. When you brought up motive, it became relevant. Happy to continue in DMs if you want.

                            2. 6

                              You’re really quite hostile. This is all over scripting languages? Or are you passive aggressively bringing up old beef?

                          2. 9

                            Integrating shellcheck and shfmt into my dev process enabled my shell programs to grow probably larger than they should be. One codebase, in particular, is nearing probably like 3,000 SLOC of Bash 5 and I’m only now thinking about how v2.0 should probably be written in something more testable and reuse some existing libraries instead of reimplementing things myself (e.g., this basically has a half-complete shell+curl implementation of the Apache Knox API). The chief maintenance problem is that so few people know shell well, so when I write “good” shell like I’ve learned over the years (and shellcheck --enable=all has taught me A TON), I’m having trouble finding coworkers to help out or to take it over. The rewrite will have to happen before I leave, whenever that may be.

                            1. 11

                              I’d be interested in what happens when you run your 3000 lines of Bash 5 under https://www.oilshell.org/ . Oil is the most bash compatible shell – by a mile – and has run thousands of lines of unmodified shell scripts for over 4 years now (e.g. http://www.oilshell.org/blog/2018/01/15.html)

                              I’ve also made tons of changes in response to use cases just like yours, e.g. https://github.com/oilshell/oil/wiki/The-Biggest-Shell-Programs-in-the-World


                              Right now your use case is the most compelling one for Oil, although there will be wider appeal in the future. The big caveat now is that it needs to be faster, so I’m actively working on the C++ translation (oil-native passed 156 new tests yesterday).

                              I would imagine your 3000 lines of bash would be at least 10K lines of Python, and take 6-18 months to rewrite, depending on how much fidelity you need.

                              (FWIW I actually wrote 10K-15K lines of shell as 30K-40K lines of Python early in my career – it took nearly 3 years LOL.)

                              So if you don’t have 1 year to burn on a rewrite, Oil should be a compelling option. It’s designed as a “gradual upgrade” from bash. Just running osh myscript.sh will work, or you can change the shebang line, run tests if you have them, etc.

                              There is an #oil-help channel on Zulip, linked from the home page.

                              1. 2

                                Thanks for this nudge. I’ve been following the development of Oil for years but never really had a strong push to try it out. I’ll give it a shot. I’m happy to see that there are oil packages in Alpine testing: we’re deploying the app inside Alpine containers.

                                Turns out that I was very wrong about the size of the app. It’s only about 600 SLOC of shell :-/ feels a lot larger when you’re working on it!

                                One thing in my initial quick pass: we’re reliant on bats for testing. bats seemingly only uses bash. Have you found a way to make bats use Oil instead?

                                1. 1

                                  OK great, looks like Alpine does have the latest version: https://repology.org/project/oil-shell/versions

                                  I wouldn’t expect this to be a pain-free experience; however, I would say it should definitely be less effort than rewriting your whole program in another language!

                                  I have known about bats for a long time, and I think I ran into an obstacle but don’t remember what it was. It’s possible that the obstacle has been removed (e.g. maybe it was extended globs, which we now support)

                                  https://github.com/oilshell/oil/issues/297

                                  In any case, if you have time, I would appreciate running your test suite with OSH and letting me know what happens (on Github or Zulip).

                                  One tricky issue is that shebang lines are often #!/bin/bash, which you can change to be #!/usr/bin/env osh. However one shortcut I added was OSH_HIJACK_SHEBANG=osh

                                  https://github.com/oilshell/oil/wiki/How-To-Test-OSH
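
                                  To spell out the two paths mentioned above (nothing new here, just restating them side by side):

                                  osh myscript.sh   # run the script under OSH without touching the file
                                  # or edit the first line: change '#!/bin/bash' to '#!/usr/bin/env osh' and run it as usual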

                                2. 1

                                  Moving away from Python? Now it has my interest… in the past I skipped it, knowing it’d probably take perf hits and have some complicated setup that isn’t a static binary.

                                  1. 2

                                    Yes that has always been the plan, mentioned in the very first post on the blog. But it took awhile to figure out the best approach, and that approach still takes time.

                                    Some FAQs on the status here: http://www.oilshell.org/blog/2021/12/backlog-project.html

                                    Python is an issue for speed, but it’s not an issue for setup.

                                    You can just run ./configure && make && make install and it will work without Python.

                                    Oil does NOT depend on Python; it just reuses some of its code. That has been true for nearly 5 years now – actually since the very first Oil 0.0.0 release. Somehow people still have this idea it’s going to be hard to install, when that’s never been the case. It’s also available on several distros like Nix.

                                    1. 1

                                      What is the status of Oil on Windows (apologies if it’s in the docs somewhere, couldn’t find any mentioning of this). A shell that’s written in pure C++ and has Windows as a first class citizen could be appealing (e.g. for cross-platform build recipes).

                                      1. 1

                                        It only works on WSL at the moment … I hope it will be like bash, and somebody will contribute the native Windows port :-) The code is much more modular than bash and all the Unix syscalls are confined to a file or two.

                                        I don’t even know how to use the Windows syscalls – they are quite different from Unix! I’m not sure how you even do fork() on Windows. (I think Cygwin has emulation, but I’m not sure there is a way to do it without Cygwin.)

                                        https://github.com/oilshell/oil/wiki/Oil-Deployments

                              2. 4

                                To the point some folks here have already raised, I’d be okay giving up shell scripting. Unfortunately, in order to do so, a replacement would: […] There are basically no programming languages that meet those criteria other than the existing shell languages.

                                I believe Tcl fits those requirements. It’s what I usually use for medium-sized scripts. Being based on text, it interfaces well with system commands, but it does not have most of bash’s quirks (argument expansion is a big one), and it can handle structured data with ease.

                                1. 4

                                  Always use #!/usr/bin/env bash at the beginning of your scripts (change if you need something else, don’t rely on a particular path to bash though).

                                  I don’t do this. Because all my scripts are POSIX shell (or at least as POSIX complaint as I can make them). My shebang is always #!/bin/sh - is it reasonable to assume this path?

                                  1. 4

                                    you will miss out on very useful things like set -o pipefail, and in general you can suffer from plenty of subtle differences between shells and shell versions. sticking to bash is also my preference for this reason.

                                    note that the /usr/bin/env is important to run bash from wherever it is installed, e.g. the homebrew version on osx instead of the ancient one in /bin (which doesn’t support arrays iirc and acts weirdly when it comes across shell scripts using them)
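
                                    a tiny sketch of what you give up without it (note that older, minimal /bin/sh implementations like dash reject set -o pipefail outright):

                                    #!/usr/bin/env bash
                                    set -euo pipefail
                                    curl -fsS https://example.com/data.txt | tee data.txt
                                    # without pipefail the pipeline's exit status is tee's (success) even when curl fails,
                                    # so a plain 'set -e' script would carry on with an empty data.txt; with it, you stop here.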

                                    1. 4

                                      My shebang is always #!/bin/sh - is it reasonable to assume this path?

                                      Reasonable is very arbitrary at this point. That path is explicitly not mandated by POSIX, so if you want to be portable to any POSIX-compliant system you can’t just assume that it will exist. POSIX says that you can’t rely on any particular path, and that scripts should instead be modified to use the system’s standard paths at installation time.

                                      I’d argue that these days POSIX sh isn’t any more portable than bash in any statistically significant sense though.

                                      1. 2

                                        Alpine doesn’t have Bash, just a busybox shell. The annoying thing is if the shebang line fails because there is no bash, the error message is terribly inscrutable. I wasted too much time on it.

                                        1. 2

                                          nixos has /bin/sh and /usr/bin/env, but not /usr/bin/bash. In fact, those are the only two files in those folders.

                                        2. 3

                                          https://mkws.sh/pp.html hardcodes #!/bin/sh. POSIX definitely doesn’t say anything about sh’s location, but I really doubt you won’t find a sh at /bin/sh on any UNIX system. Can anybody name one?

                                        3. 2

                                          I would add, prefer POSIX over bash.

                                        4. 18

                                          I checked, and shellcheck (at least the version on my computer) only catches issue #5 of the 5 I list.

                                          1. 14

                                            That’s because the other ones are options and not errors. Yes, typically they are good hygiene but set -e, for example, is not an unalloyed good, and at least some experts argue against using it.
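
                                            The canonical surprise, for anyone who hasn’t been bitten yet, looks something like this (a sketch):

                                            #!/usr/bin/env bash
                                            set -e
                                            count_lines() {
                                              local n
                                              n=$(wc -l < missing-file)   # fails here...
                                              echo "$n"                   # ...yet this still runs, because of how the caller is invoked
                                            }
                                            if count_lines; then   # set -e is suspended for anything run in an if / && / || context
                                              echo "looks fine"    # printed even though wc failed above
                                            fi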

                                            1. 3

                                              Not for lack of trying: https://github.com/koalaman/shellcheck/search?q=set+-e&type=issues

                                              There are tons of pedants holding us back IMO. Yes, “set -e” and other options aren’t perfect, but if you even know what those situations are, you aren’t the target audience of the default settings.

                                            2. 17

                                              I eventually put in the time

                                              Yup, that’s how you do it. It’s a good idea to put in the time to understand shell scripting. Most of the common misconceptions come out of misunderstanding. The shell is neither fragile (it’s been in use for decades, so it’s very stable) nor ugly (I came from JavaScript to learning shell script, and it seemed ugly indeed at first; now I find it very elegant). Keeping things small and simple is the way to do it. When things get complex, create another script; that’s the UNIX way.

                                              It’s the best tool for automating OS tasks. That’s what it was made for.

                                              +1 to using ShellCheck, I usually run it locally as

                                              shellcheck -s sh
                                              

                                              for POSIX compliance.

                                              I even went as far as generating my static sites with it: https://mkws.sh/. You’re using the shell daily for displaying data in the terminal; it’s a great tool for that, so why not use the same tool for displaying data publicly?

                                              1. 6

                                                No, it really is ugly. But I’m not sure why that matters

                                                1. 13

                                                  I believe arguing if beauty is subjective or not is off topic. 😛

                                              2. 16

                                                I went the opposite direction - I was a shell evangelist during the time that I was learning it, but once I started pushing its limits (e.g. CSV parsing), and seeing how easy it was for other members of my team to write bugs, we immediately switched to Python for writing dev tooling.

                                                There was a small learning curve at first, in terms of teaching idiomatic Python to the rest of the team, but after that we had far fewer bugs (of the type mentioned in the article), much more informative failures, and much more confidence that the scripts were doing things correctly.

                                                I didn’t want to have to deal with package management, so we had a policy of only using the Python stdlib. The only place that caused us minor pain was when we had to interact with AWS services, and the solution we ended up using was just to execute the aws CLI as a subprocess and ask for JSON output. Fine!

                                                1. 15

                                                  I tend to take what is, perhaps, a middle road. I write Python or Go for anything that needs to do “real” work, e.g. process data in some well-known format. But then I tie things together with shell scripts. So, for example, if I need to run a program, run another program, and then collect and combine the outputs of the two programs somehow, there’s a Python script that does the combining, and a shell script that runs the three other programs and feeds them their inputs.
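
                                                  Roughly the shape I mean, as a sketch (all of the program names are invented):

                                                  #!/usr/bin/env bash
                                                  set -euo pipefail
                                                  ./collect_metrics --format=json > metrics.json    # the "real" work lives in Python/Go programs
                                                  ./dump_inventory  --format=csv  > inventory.csv
                                                  python3 combine_reports.py metrics.json inventory.csv > report.html   # Python does the combining
                                                  echo "wrote report.html"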

                                                  I also use shell scripts to automate common dev tasks, but most of these are literally one-ish line, so I don’t think that counts.

                                                  1. 2

                                                    This makes sense to me

                                                  2. 8

                                                    we immediately switched to Python for writing dev tooling.

                                                    FWIW when shell runs out of steam for me, I call Python scripts from shell. I would say MOST of my shell scripts call a Python script I wrote.

                                                    I don’t understand the “switching” mentality – Shell is designed to be extended with other languages. “Unix philosophy” and all that.

                                                    I guess I need to do a blog post about this? (Ah, I remember I have a draft and came up with a title – The Worst Amounts of Shell Are 0% or 100% – https://oilshell.zulipchat.com/#narrow/stream/266575-blog-ideas/topic/The.20Worst.20Amount.20of.20Shell.20is.200.25.20or.20100.25 – requires login.)

                                                    (Although I will agree that it’s annoying that shell has impoverished flag parsing … So I actually write all the flag parsers in Python, and use the “task file” pattern in shell.)

                                                    1. 2

                                                      What is the “task file” pattern?

                                                      1. 5

                                                        It’s basically a shell script (or set of scripts) you put in your repo to automate common things like building, testing, deployment, metrics, etc.

                                                        Each shell function corresponds to a task.
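
                                                        The smallest version of it I know of is just a file of functions plus a one-line dispatcher (task bodies here are placeholders):

                                                        #!/usr/bin/env bash
                                                        # tasks.sh: run as  ./tasks.sh build   ./tasks.sh check   ./tasks.sh deploy prod
                                                        set -euo pipefail

                                                        build()  { echo "building...";  go build ./...; }
                                                        check()  { echo "testing...";   go test ./...; }
                                                        deploy() { echo "deploying to ${1:-staging}"; }

                                                        "$@"   # dispatch: the first argument names the function to run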

                                                        I sketched it in this post, calling it “semi-automation”:

                                                        http://www.oilshell.org/blog/2020/02/good-parts-sketch.html

                                                        and just added a link to:

                                                        https://lobste.rs/s/lob0rw/replacing_make_with_shell_script_for

                                                        (many code examples from others in that post, also almost every shell script in https://github.com/oilshell/oil is essentially that pattern)

                                                        There are a lot of names for it, but many people seem to have converged on the same idea.

                                                        I don’t have a link handy now, but Github had a standard like this in the early days. All their repos would have a uniform shell interface so that you could get started hacking on them quickly.

                                                  3. 5

                                                     You should investigate just (the command runner) for task running. It’s simple like make, but with none of make’s pitfalls for task running.

                                                  1. 14

                                                     I’ve always had the privilege of working at places where I am not responsible for more code than I can maintain, which is why I’m always shocked at the number of people in the opposite situation: having more code than you can maintain. I’m not referring to Python itself (inheriting a decades-old open source project is significantly different). I’m just referring to users of Python who beg and plead for backwards compatibility to never be broken so they don’t have to ever change their code. Just change it! That’s your job!

                                                     If you have leadership that refuses to hire extra devs to save you from drowning under the burden of maintenance, that’s one thing. But IMO it’s just bad engineering to ignore the costs of maintenance, including maybe having to change an import or a function name before upgrading. And to offload the consequences of that mistake onto the open-source community really grinds my gears.

                                                    1. 10

                                                      I’m happy to see more writing on this subject. I’ve seen way bigger fights fought over way smaller violations of “best practices”.

                                                       My most recent example was submitting a pull request with two commits: a documentation change and gitignoring a single file generated by the documentation. I was forced to create a separate pull request for the .gitignore change, in blind pursuit of the best practice that “unrelated” changes should always be in separate pull requests.

                                                      1. 3

                                                         What benefit do the people who want every change separated out think they get from putting them in separate PRs? I could understand wanting them in different commits (though that’s still tenuous at best), but PRs?

                                                        1. 4

                                                          Probably their merge-to-main workflow squashes by default, so each PR becomes one commit.

                                                          1. 5

                                                            That was the case. They mentioned wanting the ability to revert.

                                                            However 1) they had never reverted a commit in the repo’s 5+ year history, 2) it’s extremely unlikely that change would need to be reverted, and 3) it was a single line change; changing one line is arguably easier than finding the commit SHA to pass to “git revert”.

                                                             But the really frustrating thing is that arguing against this kind of purism is a negative-sum game. Since the time savings we’re arguing over are so tiny, even winning the argument (not having to create a separate PR) isn’t worth the ~half hour it would take to convince a purist to lighten up. I have to imagine there are other “best practices” out there that are also pointless, but just too inconsequential to argue against.

                                                            1. 2

                                                              This is often the case when someone with little real world experience starts writing down policy and misses what is important and what is not. Some things matter a lot (keeping your build system easy to set up and simple to reason about is worth a lot of effort). Some things don’t (who cares about code formatting style so long as someone picks a vaguely sane style and everyone runs an autoformatter with that style).

                                                              1. 2

                                                                 I believe that the big benefit from autoformatters is that they move arguments about formatting style out of individual PRs into PRs against the autoformatter config. In individual PRs they can invisibly add up to a lot of time wasted. Endless arguments on the autoformatter’s config are easier to see and shut down.

                                                        2. 1

                                                          But these changes are related??

                                                        1. 2

                                                          The memory space or filesystem of the process can be used as a brief, single-transaction cache. For example, downloading a large file, operating on it, and storing the results of the operation in the database. The twelve-factor app never assumes that anything cached in memory or on disk will be available on a future request or job[.]

                                                          When a participant starts responding to a message, they open a WebSocket connection to the server, which then holds their exercises in the connection handler. These get written out in the background to BigTable so that if the connection dies and the client reconnects to a different instance, that new instance can read their previous writes to fill up the initial local cache and maintain consistency.

                                                          Sounds like they are still following the rules by not relying on the state hehe

                                                          I’m not surprised they had to. You can certainly run stateful things in Kubernetes, but the ease at which you can roll out new versions of containers means restarts are common. And even when running multiple replicas, restarts still kill open connections (terminationGracePeriod can help but still has limits).

                                                          1. 3

                                                            Well, you’re right, we’re kind of in the middle: we rely on per-connection state, but we don’t rely on it existing for a long time after the connection. We wanted to go there, too, but sticky routing was unfortunately not feasible for us.

                                                          1. 10

                                                            Having seen quite a few Golang code bases from the Kubernetes ecosystem, I can tell you that Golang doesn’t magically make that kind of code less complex.

                                                            One example is: with goroutines having first-class support, that style of concurrency gets ham-fisted everywhere, even if multiprocessing, async/await, or other concepts would have been better.

                                                            It’s almost like blindly striving for “simplicity” above all is just bad engineering. It’s a resource to balance like everything else.

                                                            1. 2

                                                              It’s a resource to balance like everything else.

                                                              Thank you! This is something that I find is missing from many discussions about the value of x language or tool. Whether choosing your tools or writing code, there are always trade-offs to be made, and choosing carefully for your use case is going to make for the best results.

                                                               That’s one reason I like to read opinion pieces along these lines, because they may help me make better-informed choices. I can’t say this particular piece did much for me, though. I disagree with its assumptions and didn’t see any effort made to change my mind with evidence.

                                                            1. 3

                                                              I’m trying to figure out what caused that Kubernetes issue. Kubernetes without the control plane is allegedly like a server without a sysadmin: everything stays static (no new containers started, no routes updated, secrets/configmaps not updated). If a worker node can’t talk to the control plane, it won’t just kill all the containers by default.

                                                              However, I wouldn’t be surprised if some Kubernetes add-on breaks that assumption. The side effect of so many people using managed Kubernetes clusters is that so few people have full-blown control plane outages like the one in this post. Thus, the disaster recovery behavior never gets tested enough. I’ve had just two power outages and discovered tons of bugs and documentation gaps across the Kubernetes ecosystem each time.

                                                              1. 21

                                                                As someone who gets pinged every couple months to prep a library release, I really like standard libraries that are large and featureful:

                                                                • can get more consistent releases
                                                                • dependents can rely on each other (no more wasting time downloading 50 one-liners)
                                                                 • less likely to have issues with maintainers disappearing off the face of the earth
                                                                • featureful libraries can all depend on baseline tools to work (you can write a nice HTTP request API without having to deal with all the details of network connections)

                                                                 Sometimes libraries have hard-to-grok bugs where you just need the expertise, and people don’t have it. But soooo many library things are just like “Django slightly changed its internal API”, “this tooling config just needs a tweak”, or “this function in utils.py has a divide-by-zero bug with an easy, obviously correct fix”.

                                                                 There’s the community aspect (like if the standard lib is managed super conservatively, getting changes merged in would be tough), but I think there have been 20+ years of experimentation on this front, and a new language could offer a very nice standard library, if people are a bit more flexible on backwards compat etc.

                                                                (standard-contrib might be a good alternative too, where things are not in the never-changing standard library but at least are managed by a large group of people and are recognized to be kinda fundamental packages that should work and get prompt fixes)

                                                                1. 7

                                                                  I like the idea of a “standard-contrib”. Another issue with not having a big standard library is the one affecting the JavaScript ecosystem: too many micro-libraries owned by too many individuals with too little oversight. A standard-contrib frees you from being tied to the language, but puts all the eggs into one well-watched basket.

                                                                  1. 1

                                                                     Nim has a concept of about 100 “important packages” in its GitHub CI. It runs those packages’ tests right away, so as to catch compatibility problems going into the compiler/stdlib.

                                                                     I would say it also has a larger-than-average standard library: random numbers, PEGs, various SQL bindings, off-the-beaten-path data structures like critbit trees (tries-in-a-BST style), etc., etc. It is at least big enough that the average transitive closure of package dependency graphs is quite small - maybe 0-15 other packages. Nevertheless, Nim also struggles with how to grow its stdlib.

                                                                  2. 6

                                                                    I like a large standard library at the programming end of things, too. I think standard libraries are one of those things where a middle ground results in nothing but frustration: either you go all in and provide everything, including two types of kitchen sinks (Python), or you take the barebones approach and provide nothing but a small, extensible interface (Lua).

                                                                    The middle ground looks good on paper and it’s probably fine when you’re hacking on small projects in your spare time, or doing large, greenfield, in-house projects where you actually get to write half the things you use anyway. But on large, long-lived projects, this is extraordinarily frustrating. I’ve spent a disappointing amount of time on things like bickering over code reviewing my way into “the right way” to do something completely inconsequential. A large standard library provides “the right way” for all those tiny things like the correct and idiomatic way to read a six-line YAML file, things that are completely irrelevant on computers from this century. That “right way” is most likely going to be inadequate for working with a sixty billion-line YAML file because that’s just how large standard libraries are. But I really don’t want to solve the “what if that file is sixty bilion lines???” problem every time I read six lines of boilerplate.

                                                                    On an average day I think I hate about half of Python’s standard library but I’ll take a clunky API that lets me write useful code once over a beautiful, constantly-evolving API (usually with mediocre implementations because you don’t get really good implementations in six months) that means I have to rewrite useful code once a year.

                                                                    Both approaches – “everything but the kitchen sink” and “almost nothing” – guarantee that, albeit in different ways (there’s a standard API for everything vs. you get to write your own standard API for everything).

                                                                    The middle ground lets the tech evangelists, language fanboys and resume padders roam free, and you get all sorts of “valuable” innovation in extraordinarily relevant areas like the right way to fetch a response over HTTP. I’m all for deferring to a flexible, powerful library written by someone who’s an expert on that if I have to do like a million of them a second, and obviously, since that’s a challenging problem, the state of the art in solving it is in constant flux and it makes sense that the API for it would be in constant flux. But radical advances in the state of the art of “how to do a HTTP GET once in a blue moon” is really not something I want to change working code over. I’d much rather spend time giving my users cool new things than rewording old code so as to do exactly the same thing.

                                                                    1. 5

                                                                      There’s one other benefit to a good standard library: consistency. OpenStep is my go-to example of a well-designed standard library (technically, Objective-C has no standard library but OpenStep is the de-facto one). It uses a small handful of design patterns and applies them consistently. It uses consistent naming for classes, protocols, methods, fields, and arguments. If you know some part of the library, learning any other part is trivial.

                                                                       This level of consistency is difficult to achieve in libraries maintained by different sets of people. For example, Boost is a forest of different trees, each of which is internally consistent and may contain some common dependencies on other bits of Boost, but which use different conventions to other bits. When bits of Boost are promoted to the standard library (the stated goal of Boost is a playground for prototyping future bits of the C++ standard library) then their interfaces are usually tweaked to make them more consistent with the rest of the standard library. The conventions in the C++ standard library are awful (exactly the same naming rules are applied to methods, classes, and fields; methods have names like empty that are both verbs and nouns and you have to know which is meant; and there are lots of short names like rend that you have to know mean a reverse iterator to the end and not the English word rend), but at least they’re the same awful throughout the standard library.

                                                                    1. 1

                                                                       Aside from gVisor’s memory and networking performance, it seemed like everything had comparable performance to the “host” (which I assume is bare-metal).

                                                                      I gotta send this to people when they demand bare-metal machines for “performance”. You’re running a <10RPS web app that spits out json from a database; you don’t have some hand-crafted assembly that requires you to care about cache levels/core affinity/noisy neighbors, etc.

                                                                      1. 8

                                                                        Designate a place in your home or a friend/family home where important documents are stored, put them in a waterproof and fireproof safe and include your 2FA backup codes. It’s significantly harder to lose a safe than lose a sheet of paper and the safe provides resilience against fire and flood, both of which would destroy any computer storage that could contain these codes otherwise.

                                                                        1. 15

                                                                          The “waterproof AND fireproof” point is important. Fires tend to be fought with lots and lots of water, but many fireproof safes that you can buy at the hardware store are only fireproof. I’ve made that mistake myself.

                                                                        1. 9

                                                                          For less-important sites, I’d just put backup codes in a password manager. Then you only need to print backup codes for a few critical services like email or the password manager itself.

                                                                          Totally guessing here, but I feel like the likelihood of different attacks is something like: broken authentication on the site > password breach > SIM swap attack >>> someone breaks into your password manager. So the common attacks are either unavoidable (like if they only support SMS 2FA) or you’re protected no matter where your backup codes are stored.

                                                                          1. 1

                                                                            Most 2fa secrets, like for Google authenticator, are stored on the server, so I think your second category is the same as the last category for 2fa. They would then have to break the password hash, but you’re pretty exposed at that point.

                                                                             Do dark web data dumps ever contain 2FA secrets? Or do they just mark the password as bad if the account has 2FA?

                                                                          1. 4

                                                                            I’m afraid it’s perfectly possible to ship one version of your code to GitHub and a different version to npm.

                                                                            Is anyone aware of any cool cryptography that might help assert code on Github matches the code on NPM? Or asserting code on Github matches code running on a server?

                                                                            1. 3

                                                                               If the builds are reproducible, you can verify it by simply comparing your own build results (assuming you trust your own compiler chain, and with webpack having 780 dependencies, I’m not so sure about that) with the ones you get from NPM. I’m not sure how viable setting up reproducible builds is in the JS ecosystem though.
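
                                                                               For a single package, the manual version of that comparison is pretty short, assuming the package actually builds reproducibly from its repo (a big assumption; all names and versions below are placeholders):

                                                                               git clone https://github.com/example/some-package && cd some-package
                                                                               git checkout v1.2.3
                                                                               npm ci && npm pack                        # build a tarball from the source you can read
                                                                               npm view some-package@1.2.3 dist.shasum   # checksum of what the registry actually serves
                                                                               shasum some-package-1.2.3.tgz             # checksum of what you just built; compare the two

                                                                               In practice the two tarballs often differ for boring reasons (timestamps, file ordering, a prepublish build step), which is exactly the reproducible-builds gap mentioned above.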

                                                                              1. 2

                                                                                This is what Code Transparency is intended to do. You run your builds in a known environment in a confidential computing environment that gives you an attestation over the initial state of the environment (VM image contents plus git hash of the repo). The VM is configured so that the only thing it will do is your build (no accepting incoming connections, for example) and produce the output. The output is signed with a key derived from the attestation. You then append this hash into a public ledger. Anyone else can then audit your build system and VM image and can verify that running the same build gives the same output.

                                                                                Whether anyone will actually do that auditing is an open question. If your VM image has a custom container with a load of stuff that you’ve built then that’s all in the TCB and who knows what’s going on there.

                                                                                 This is all predicated on reproducible builds, of course. If running the same build twice gives different output then there’s no automated way of verifying the process.

                                                                                1. 2

                                                                                  Cryptography can make it easier to simply trust an authority’s signature on a compiled blob, but besides that, I don’t think it will help much. You can certainly download all the source code for each module and run the build for each module yourself. To be honest I don’t know if npm or yarn support that type of operation out of the box. Something tells me even if they did, it would probably fail on many packages for unforeseen reasons; you would have to fix a bunch of “broken builds”.

                                                                                   I do feel like this is kind of silly though. This isn’t a problem unique to JavaScript and npm. I will say, regarding JavaScript and analytics, however: IMO malicious or “criminal” intent is not even required for these kinds of “bad things” to happen. It can also happen straight through the front door, without having to hide anything, and without doing all this silly malware-style stuff like changing the behavior based on the time of day.

                                                                                  1. 1

                                                                                    Plugging my own project, but: the fact that we don’t have a good answer to this is why I made xray.computer which lets you view the code that’s actually published to npm. It’s only a plaster on a major wound, though – npm packages often include pre-minified/compiled code, which is what your project will actually use, and it’s almost impossible to review since it’s essentially obfuscated.

                                                                                    Reproducible builds and/or a cryptographically signed web of trust are the best proposals I know of, though it’s worth noting that npm does appear to do some relatively advanced work on detecting malicious packages behind the scenes.

                                                                                    1. 1

                                                                                      I’ve heard of https://github.com/in-toto/demo and was able to find it after I made this comment. Haven’t used it, but seems interesting. Apparently datadog uses it: https://www.datadoghq.com/blog/engineering/secure-publication-of-datadog-agent-integrations-with-tuf-and-in-toto/

                                                                                      1. 1

                                                                                        Related: it seems that something fishy is going on with the nose package on PyPi: https://zaitcev.livejournal.com/263602.html

                                                                                        The code for [nose] in its source repository was fine, only PyPI was bad.

                                                                                        […]

                                                                                        [The maintainers] disclaimed any knowledge of what went on.

                                                                                        So, an unknown entity was able to insert a certain code into a package at PyPI, and pip(1) was downloading it for years. This only came to light because the inserted code failed on my Fedora test box.

                                                                                        1. 1

                                                                                          The author provides no details of what difference existed between what they saw in the repo and what they saw in the package, what effect it actually had, or anything else that would allow readers to verify or investigate for themselves, and simply declares the only possible conclusion to be that PyPI itself has been silently and completely compromised. Not that a maintainer’s account for an unmaintained (coming up on 7 years since last release) package could have been. Not that it’s possible the last release legitimately wasn’t generated from what the author saw in the GitHub repo. Not any of the other many possible explanations. And I haven’t seen this allegedly huge “PyPI compromise” mentioned anywhere else despite that post being over a week old.

                                                                                          Smells an awful lot like FUD, and until the author chooses to provide some more information that will be my stance.

                                                                                      1. 10

                                                                                        Names in databases really do suck. While we’re on the subject of broken schemas, I have a bone to pick with addresses too.

                                                                                        People who live in apartments frequently have to fill out the “address line 2” field for an apartment/suite number, but some sites expect you to jam it all into “address line 1” and then complain that it’s “too long”. I’ve also seen sites that only let you specify a single number for the suite/unit when it should be freeform. Some sites force everything to all caps, which isn’t any more correct than what I typed. Even USPS does some of this, as well as classifying some addresses as “commercial” and then blocking you from submitting a change-of-address request if it thinks you’re moving to a so-called commercial address.

                                                                                        The worst sites are the ones that refuse to accept an address unless some address validation service blesses it. If you live somewhere brand new, somewhere that was recently annexed into a city, or somewhere just weird enough (i.e. not a single-family home in a rich suburb) to be missing from the system, you’re screwed. Only use address validators if you let users override them. I know where I live, damn it!

                                                                                        And that’s just for places I’ve lived personally. There’s probably a bunch more issues for rural areas and international addresses.
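
                                                                                        For what it’s worth, a minimal sketch of the more forgiving shape argued for above: freeform lines, optional postcode, and a validator that warns instead of blocks. The field names and the lookup function are illustrative, not any particular library.

                                                                                          // address.ts - hedged sketch of a schema that tries not to reject real addresses.
                                                                                          interface PostalAddress {
                                                                                            line1: string;        // freeform and generously sized
                                                                                            line2?: string;       // also freeform: "Apt 4B", "2nd door, rear entrance", ...
                                                                                            city?: string;
                                                                                            region?: string;      // state / province / county; optional, not an enum
                                                                                            postalCode?: string;  // optional: not every address has one
                                                                                            country: string;      // whatever the user says it is
                                                                                          }

                                                                                          // Hypothetical external verification service; only its yes/no answer matters here.
                                                                                          declare function lookupAddress(addr: PostalAddress): Promise<boolean>;

                                                                                          // Validation that advises instead of refusing: surface doubts, then let the user
                                                                                          // confirm the address they actually typed.
                                                                                          async function checkAddress(addr: PostalAddress): Promise<string[]> {
                                                                                            const warnings: string[] = [];
                                                                                            if (!addr.line1.trim()) warnings.push("Address line 1 looks empty.");
                                                                                            if (!(await lookupAddress(addr))) {
                                                                                              warnings.push("We couldn't verify this address; double-check it and continue if it's right.");
                                                                                            }
                                                                                            return warnings; // the caller displays these but never blocks submission
                                                                                          }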

                                                                                        1. 8

                                                                                          For a few years I lived in an apartment building that had retail on the first floor, with an entrance for each. If a delivery went to the wrong door it usually was lost or stolen, but specifying the correct door went on the second line of the address. USPS always saw it, but the harried, closely-monitored package delivery services almost never did. So I’d swap to put that on the first line, which is invalid but always worked. Except when blocked by a site that required using an address their “validator” approved of. It’s very frustrating to have a site tell me I don’t know my own address.

                                                                                            1. 2

                                                                                              Until recently we didn’t have postcodes in Ireland. It was always a fun dance trying to figure out which combo of “”, “NA”, “ N/A”, “000”, “ 000000” etc would be accepted.

                                                                                              And then of course the package would show up on your door with the nonsense postcode proudly printed on it.

                                                                                              1. 2

                                                                                                The UK is interesting for this type of thing because UK addresses break almost every assumption about addresses. Houses without a number, and houses on streets that don’t have a name, are quite common. Another thing that annoys me to no end: systems insisting that the country has to be “United Kingdom” when it’s almost always better to put the actual country in the country field (e.g. England, Scotland, etc.).

                                                                                                1. 1

                                                                                                  The nice thing is that house name or number + postcode always uniquely identifies an address in the UK. Unfortunately, a lot of things don’t take advantage of this. My address is {number}, {road name}, {postcode} but there are two blocks of flats on the same road that have the address {number}, {block of flats name}, {road name}, {postcode}, where the number and road name are the same as my address but the postcode is different. Delivery drivers (especially food delivery ones) often come to my house when they’re aiming for one of the flats.

                                                                                                2. 2

                                                                                                  Validators are such a pain. A store I go to always asks for a phone number for their rewards account. In the last few months, some idiot decided they should validate all of the phone numbers in their database…

                                                                                                  Of course, the phone number I’ve used for seven years doesn’t validate. Along with hundreds of other people who shop there.

                                                                                                1. 2

                                                                                                  Having read through this post, this entire comment section, and the one on Orange Site, I guess I have to admit that there really isn’t a perfect solution for automated internal TLS.

                                                                                                  Let’s Encrypt or any other ACME-compatible CA requires public HTTP access or public TXT records. Public HTTP is a non-starter, and BIND doesn’t have a concept of “private A record, but public TXT record” AFAIK. So you have to either expose your RFC1918 IPs in DNS records (bad practice) or not have TLS.
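
                                                                                                  For context, a hedged sketch of what the CA actually queries during a DNS-01 challenge: only a TXT record at _acme-challenge.<name> has to be publicly resolvable; the host’s A/AAAA record is never consulted. That’s why the workaround people usually suggest is to CNAME just that one label out to a separate public zone, though that still means running and automating yet another zone, so it’s hardly perfect. The hostname below is a placeholder.

                                                                                                    // dns01-peek.ts - sketch of the single lookup behind a DNS-01 challenge (RFC 8555).
                                                                                                    import { resolveTxt } from "node:dns/promises";

                                                                                                    const host = "internal-app.corp.example.com"; // placeholder internal hostname

                                                                                                    async function main() {
                                                                                                      // The CA resolves this name over the public internet and compares the TXT value
                                                                                                      // against the expected key-authorization digest.
                                                                                                      const records = await resolveTxt(`_acme-challenge.${host}`);
                                                                                                      console.log(records.flat());
                                                                                                    }

                                                                                                    main().catch((err) => console.error("lookup failed:", err.message));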

                                                                                                  A traditional CA like DigiCert will let us generate valid certs without any kind of DNS/HTTP challenge, for a price. Which is so frustrating because 1) why don’t they need all the extra validation that ACME does, and 2) the price increases and yearly fees feel like they’re approaching pure rent-seeking, profiting off the billions of EoL devices that will simply never get new CAs added to their trust stores.

                                                                                                  And an internal CA has challenges too. Sure you can push the CA cert to all employee laptops pretty easily, but how about every VM? Every container? Every VM/container on every laptop? Every IoT device in all conference rooms? Every virtual appliance where you can’t even SSH in without typing some secret code into an interactive session? If you can’t do all that, can you afford to have every dev independently set that up in every container/VM they run? They’d likely prefer to just set skipTLSVerify at that point.
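
                                                                                                  To make the per-app cost concrete, here’s roughly what trusting an internal CA looks like for a single Node service, next to the skipTLSVerify-style shortcut. The CA path and URL are placeholders.

                                                                                                    // trust-internal-ca.ts - sketch of per-application trust for an internal CA in Node.
                                                                                                    // Assumes the CA bundle is mounted at /etc/ssl/internal-ca.pem (placeholder path).
                                                                                                    import { readFileSync } from "node:fs";
                                                                                                    import { Agent, get } from "node:https";

                                                                                                    const internalCa = readFileSync("/etc/ssl/internal-ca.pem");

                                                                                                    // Option 1: hand the CA to this client explicitly, keeping verification on.
                                                                                                    const agent = new Agent({ ca: internalCa });
                                                                                                    get(
                                                                                                      "https://vault.internal.example.com/v1/sys/health", // placeholder internal URL
                                                                                                      { agent },
                                                                                                      (res) => console.log("status:", res.statusCode)
                                                                                                    );

                                                                                                    // Option 2, process-wide and with no code change: start the app with
                                                                                                    //   NODE_EXTRA_CA_CERTS=/etc/ssl/internal-ca.pem
                                                                                                    // The tempting shortcut, NODE_TLS_REJECT_UNAUTHORIZED=0, is Node's equivalent of
                                                                                                    // skipTLSVerify and silently disables verification for every connection.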

                                                                                                  I saw mentions of “name constrained” CAs, which may help with the problem, but please, someone save me from the TLS prison.