1. 4

    Did you consider using Ragel and decide re2c was better? Curious to hear your comparisons if so.

    1. 3

      Good question, although I don’t have a great answer.

      I didn’t really consider Ragel. I had heard about it a long time ago reading Zed Shaw’s posts about its use parsing HTTP in the Mongrel web server.

      It didn’t really cross my mind when thinking about Oil. I encountered re2c in the lexer for the Ninja build system:

      https://github.com/ninja-build/ninja/blob/master/src/lexer.in.cc

      IIRC there was a commit that indicated that the generated code is faster than hand-written code. They started with a hand-written lexer. Also, Ninja is very performance sensitive, e.g. they optimized the heck out of it for incremental builds, which involves parsing big auto-generated Ninja files.

      I looked at the code and I liked how it was written, so I modelled the original Oil lexer in C++ after it.


      Looking at Ragel now, it looks very similar to re2c. The applications seem to be more network-oriented than programming language-oriented, but I don’t think that’s fundamental.

      re2c can also execute semantic actions in the host language with {} when it matches a regex. So they look very similar to me.

      However the latest I heard about Ragel is that it was involved in Cloudbleed, though perhaps not at fault. I think there is an API design issue. And also it seems to be somewhat commercial and developed by a single person company?

      http://www.colm.net/news/

      I don’t feel a need to look into Ragel more, since I already have working re2c code, but I’m definitely interested if anyone else has experience.

      FWIW I read this paper on re2c, although I believe it’s changed a lot in the last 25 years!

      http://re2c.org/1994_bumbulis_cowan_re2c_a_more_versatile_scanner_generator.pdf

      Also, on the re2c website, it says PHP uses it… but I haven’t looked at the code to see where.

      1. 3

        Ragel is good software. It has had some history of being closed off, then reopened, as the maintainer tried to find a good way to profit from his work.

        The Cloudbleed problem wasn’t really Ragel’s fault, but the maintainer made a patch regardless to make it even less likely.

    1. 12

      I’m going to be at Strange Loop & PWLConf this week!

      I did more work on theft this weekend: I’m making progress on adding multi-core support (what’s faster than generating and running tests on 1 core? doing it on 40 cores in parallel!). I got theft running property tests on its own concurrent planner/scheduler, which found a couple subtle bugs by generating arbitrary event interleavings.

      Work: Writing code to compile a huge graph data structure to generated C code. While I can’t say anything about what it represents, I seem to have discovered some pathological cases in clang related to large switch/case bodies. In particular, changing the formatting was the difference between building in 3m45sec and 7h20m. When my main work is done, I want to write a script that generates similarly structured (but IP-free) C code and post a bug report for clang.

      1. 4

        changing the formatting was the difference between building in 3m45sec and 7h20m.

        What on earth? I thought I knew how compilers worked.

        1. 3

          The parser is often the slowest pass - it’s the only one that traverses every single byte of the input.

          I’d love to hear more about what happened here.

          1. 6

            Parsing isn’t the problem – while it will be clearer when I get around to writing that generator script (which may not be for a bit), the general idea is this:

            Slow version: There’s a switch-case statement, which has about 100,000 cases. Each of the case bodies is braced (i.e., has no variables that escape the scope), calls a function with an ID and a bool that sets or clears one or more flag bits on a struct, and a subset of the case bodies end in a goto that jumps to an earlier case body. (The rest end in a break; there are no cycles or fall-throughs.) The generated .c file is about 125 MB. This version took over 7 hours to compile (on a new MBP laptop).

            Medium-fast version: There are no gotos, each case body calls a function named “case_X” for each case body that it would previously goto, walking up the chain of earlier case bodies (i.e., there is duplication). There are forward references to every function (static) generated at the beginning of the file, the functions all appear after the huge switch-case function. This version took about 75 minutes to compile, and is about 300 MB. (Parsing isn’t the problem!)

            Fast version: Same as the medium fast version, except the switch-case body is broken up into several functions, and there’s a wrapper that says if (id < 10000) { return switch_fun_lt_10000(id); } else if (id < 20000) { return switch_fun_lt_20000(id); } else ... up to 100,000. (This could be broken up into a binary tree, but cache locality probably makes up for it. I’ll benchmark later.) I shouldn’t need to do that, but this version only takes 3m45sec to compile. Reducing the number of cases in the switch statement helps quite a bit.

            Something I haven’t tried yet is just indexing into a numeric array of function pointers; it should be functionally identical. I started using switch-case because it’s more declarative, and should have the same result.

            There seems to be something super-linear (> O(n)) in the handling of switch-case (probably some sort of analysis, checking everything against everything else), and it gets noticeably worse as the number of case labels increases past 10,000 or so. Adding gotos between them adds more burden to the analysis. It should be able to rule out interactions between them, though.
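
            For anyone who wants to poke at this before the real generator script exists, here is a rough shell sketch of what the fast layout might look like. Only the switch_fun_lt_N naming and the range-checking wrapper come from the description above; gen_switch_c, set_flag, dispatch, and the case bodies are made up for illustration, and it leaves out the gotos and the case_X helper functions entirely.

            ```shell
            # gen_switch_c N CHUNK
            # Hypothetical generator: prints a C file with N case labels split into
            # CHUNK-sized switch functions, plus a wrapper that picks the right one.
            gen_switch_c() {
                n=$1 chunk=$2
                echo 'void set_flag(int id, int on);'   # assumed external helper
                lo=0
                while [ "$lo" -lt "$n" ]; do
                    hi=$((lo + chunk))
                    [ "$hi" -gt "$n" ] && hi=$n
                    echo "static void switch_fun_lt_${hi}(int id) {"
                    echo '    switch (id) {'
                    i=$lo
                    while [ "$i" -lt "$hi" ]; do
                        # Braced body, one call, ends in break (no gotos in this sketch).
                        echo "    case $i: { set_flag($i, 1); break; }"
                        i=$((i + 1))
                    done
                    echo '    }'
                    echo '}'
                    lo=$hi
                done
                # Wrapper: a linear chain of range checks, as in the fast version.
                echo 'void dispatch(int id) {'
                hi=0
                while [ "$hi" -lt "$n" ]; do
                    hi=$((hi + chunk))
                    [ "$hi" -gt "$n" ] && hi=$n
                    echo "    if (id < $hi) { switch_fun_lt_${hi}(id); return; }"
                done
                echo '}'
            }
            ```

            Something like gen_switch_c 100000 10000 > big.c && time cc -c big.c would then give a number to compare against a single-chunk run (gen_switch_c 100000 100000). Whether a toy like this reproduces the slowdown is untested.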

            1. 1

              That’s fascinating. I look forward to the bug thread when they find the cause. :)

      1. 1

        I’m getting:

        osh: Modules/main.c:347: Ovm_Main: Assertion `sts != -1' failed.
        Aborted (core dumped)
        
        1. 1

          Oh are you on Arch Linux? 2 other people reported that but I haven’t been able to reproduce.

          Can you try this?

          $ OVM_VERBOSE=1 osh  -c 'echo hi' 2>log.txt
          hi
          

          And then send me log.txt.

          I would appreciate if you would post the report here: https://github.com/oilshell/oil/issues/16

          Mine looks like this:

          $ head log.txt 
          ovm_path = ovm_path_buf (/usr/local/bin/oil.ovm)
          ovm_path: /usr/local/bin/oil.ovm
          # installing zipimport hook
          import zipimport # builtin
          # installed zipimport hook
          # zipimport: found 234 names in /usr/local/bin/oil.ovm
          import _codecs # builtin
          import codecs # loaded from Zip /usr/local/bin/oil.ovm/codecs.pyc
          import encodings.aliases # loaded from Zip /usr/local/bin/oil.ovm/encodings/aliases.pyc
          import encodings # loaded from Zip /usr/local/bin/oil.ovm/encodings/__init__.pyc
          
        1. 37

          The “downsides” list is missing a bunch. I mean, I use Makefiles too, probably too much, but they do have some serious downsides, e.g.

          • The commands are interpreted first by make, then by $(SHELL), giving some awful escaping at times
          • If you need to do things differently on different platforms, or package things for distros, you pretty quickly have to learn autoconf or even automake, which adds greatly to the complexity (or reinvent the wheel and hope you didn’t forget some edge-case with DESTDIR installs or whatever that endless generated configure script is for)
          • The only way to safely (e.g. parallelizable) do multiple outputs is by using the GNU pattern match extension, which is extremely limited (rules with multiple inputs to multiple outputs is hard to write without lots of redundancy)
          • GNU make 4 has different features from macOS’s (pre-GPLv3) make 3.81, which in turn has different features from the various BSD makes
          • You really have to understand how make works to avoid doing things like possibly_failing_command | sed s/i/n/g > $@ (which will create $@ and trick make into thinking the rule succeeded because sed exited with 0 even though the first command failed). And do all your devs know how to have multiple goals that each depend on a temp dir existing, without breaking -j?

          and there’s probably lots more. OTOH, make has been very useful to me over the years, I know its quirks, and it’s available on all kinds of systems, so it’s typically the first thing I reach for even though I’d love to have something that solves the above problems as well.
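
          To make the possibly_failing_command | sed pitfall from the list concrete, here is a small sketch. It runs the pipeline through sh -c, roughly the way make hands a recipe line to $(SHELL); false stands in for the failing command and the function name is made up.

          ```shell
          # run_recipe_like_make TARGET
          # Stand-in for the recipe: possibly_failing_command | sed s/i/n/g > $@
          run_recipe_like_make() {
              sh -c 'false | sed s/i/n/g > "$1"' sh "$1"
          }
          ```

          The call returns 0 because only sed’s exit status counts, and the (empty) target file now exists, so as far as make can tell the rule succeeded.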

          1. 14

            Your additional downsides make it sound like maybe the world needs a modern make. Not a smarter build tool, but one with fewer 40-year-old-Unix design sensibilities: a nicer, more robust language; a (small!) handful of missing features; and possibly a library of common functionality to limit misimplementations and cut down on the degree to which every nontrivial build is a custom piece of software itself.

            1. 7

              mk?

              1. 3

                i’ve also thought of that! for reference: https://9fans.github.io/plan9port/man/man1/mk.html

              2. 10

                I think the same approach as Oil vs. bash is necessary: writing something highly compatible with Make, separating the good parts and bad parts, and fixing the bad parts.

                Most of the “make replacements” I’ve seen make the same mistake: they are better than Make with respect to the author’s “pet peeve”, but worse in all other dimensions. So “real” projects that use GNU Make like the Linux kernel and Debian, Android, etc. can’t migrate to them.

                To really rid ourselves of Make, you have to implement the whole thing and completely subsume it. [1]

                I wrote about Make’s overlap with shell here [2] and some general observations here [3], echoing the grandparent comment – in particular how badly Make’s syntax collides with shell.

                I would like for an expert in GNU Make to help me tackle that problem in Oil. Probably the first thing to do would be to test if real Makefiles like the ones in the Linux kernel can be statically parsed. The answer for shell is YES – real programs can be statically parsed, even though shell does dynamic parsing. But Make does more dynamic parsing than shell.

                If there is a reasonable subset of Make that can be statically parsed, then it can be converted to a nicer language. In particular, you already have the well-tested sh parser in OSH, and parsing Make’s syntax is 10x easier than that. It’s basically the target line, indentation, and $() substitution. And then some top level constructs like define, if, include, etc.
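
                As a very rough illustration of how little surface syntax that is, here is a deliberately naive line classifier. It is not how pymake or OSH work; classify_make_lines and its categories are invented for this sketch, and it ignores line continuations, the body of define blocks, and anything produced dynamically.

                ```shell
                # classify_make_lines: reads a Makefile on stdin and prints one label per line.
                classify_make_lines() {
                    awk '
                        /^\t/            { print "recipe";     next }  # tab-indented recipe line
                        /^[ ]*(#|$)/     { print "blank";      next }  # comment or empty line
                        /^(define|endef|ifeq|ifneq|ifdef|ifndef|else|endif|include)([ \t]|$)/ {
                                           print "directive";  next }  # top-level construct
                        /^[^:=]*[:?+]?=/ { print "assignment"; next }  # VAR = / := / ?= / +=
                        /:/              { print "rule";       next }  # target line
                                         { print "other" }
                    '
                }
                ```

                Running real Makefiles through something like this and counting the "other" lines would be one cheap way to start measuring how much is statically recognizable.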

                One way to start would be with the “parser” in pymake [4]. I hacked on this project a little. There are some good things about it and some bad, but it could be a good place to start. I solved the problem of the Python dependency by bundling the Python interpreter. Although I haven’t solved the problem of speed, there is a plan for that. The idea of writing it in a high-level language is to actually figure out what the language is!

                The equivalent of “spec tests” for Make would be a great help.

                [1] https://lobste.rs/s/ofu5yh/dawn_new_command_line_interface#c_d0wjtb

                [2] http://www.oilshell.org/blog/2016/11/14.html

                [3] http://www.oilshell.org/blog/2017/05/31.html

                [4] https://github.com/mozilla/pymake

                1. 6

                  Several more modern make-style tools exist - e.g. ninja, tup and redo.

                  1. 2

                    We need a modern make, not make-style tools. It needs to be mostly compatible so that someone familiar with make can use “modern make” without learning another tool.

                    1. 8

                      I think anything compatible enough with make to not require learning the new tool would find it very hard to avoid recreating the same problems.

                  2. 2

                    The world does, but

                    s/standards/modern make replacements/g

                  3. 5

                    Do most of these downsides also apply to the alternatives?

                    The cross platform support of grunt and gulp can be quite variable. Grunt and gulp and whatnot have different features. The make world is kinda fragmented, but the “not make” world is pretty fragmented, too.

                    My personal experience with javascript ecosystem is nil, but during my foray into ruby I found tons of rakefiles that managed to be linux specific, or Mac specific, or whatever, but definitely not universal.

                    1. 5

                      I recommend looking at BSD make as its own tool, rather than ‘like gmake but missing this one feature I really wanted’. It does a lot of things people want without an extra layer of confusion (automake).

                      Typical bmake-only makefiles rarely include shell script fragments piping output around, instead they will use ${VAR:S/old/new} or match contents with ${VAR:Mmything*}. you can use ‘empty’ (string) or (file) ‘exists’.

                      Deduplication is good and good mk fragments exist. here’s an example done with bsd.prog.mk. this one’s from pkgsrc, which is a package manager written primarily in bmake.

                      1. 2

                        Hey! Original author here :). Thanks a bunch for this feedback. I’m pretty much a Make noob still, so getting this type of feedback from folks with more experience is awesome to have!

                        1. 2

                          You really have to understand how make works to avoid doing things like possibly_failing_command | sed s/i/n/g > $@ (which will create $@ and trick make into thinking the rule succeeded because sed exited with 0 even though the first command failed).

                          Two things you need to add to your Makefile to remedy this situation:

                          1. SHELL := bash -o pipefail. Otherwise, the exit status of a shell pipeline is the exit status of the last element of the pipeline, not the exit status of the first element that failed. ksh would work here too, but the default shell for make, /bin/sh, won’t cut it – it lacks pipefail.
                          2. .DELETE_ON_ERROR:. This is a GNU Make extension that causes failed targets to be deleted. I agree with @andyc that this behavior should be the default. It’s surprising that it isn’t.

                          Finally, for total safety you’d want make to write to .$@.$randomness.tmp and use an atomic rename if the rule succeeded, but afaik there’s no support in make for that.

                          So yes, “you really have to understand how make works [to avoid very problematic behavior]” is an accurate assessment of the state of the things.
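
                          Since make has no built-in support for the temp-file-and-rename step, one way to get it is to wrap the recipe in a small helper. This is only a sketch: build_atomically is a made-up name, it assumes bash and mktemp are installed, and it uses a $target.XXXXXX temp name rather than the .$@.$randomness.tmp form.

                          ```shell
                          # build_atomically TARGET COMMAND
                          # Runs COMMAND under "bash -o pipefail", writing to a temp file next to
                          # TARGET, and renames it into place only if the whole pipeline succeeded.
                          build_atomically() {
                              target=$1 cmd=$2
                              tmp=$(mktemp "$target.XXXXXX") || return 1
                              if bash -o pipefail -c "$cmd" > "$tmp"; then
                                  mv -f "$tmp" "$target"   # a rename within one filesystem is atomic
                              else
                                  rm -f "$tmp"             # failed: leave no half-written target behind
                                  return 1
                              fi
                          }
                          ```

                          With this, build_atomically out.txt 'possibly_failing_command | sed s/i/n/g' either produces a complete out.txt or leaves nothing at all.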

                          1. 1

                            Your temp directories dependency problem makes me think a GUI to create your rules and drag-and-drop them around could be useful. It could have “branching” and “merging” steps that indicate parallelism and joining too.

                          1. 6

                            I upgraded from an X1C3 to the WQHD X1C5 a couple months ago and have been really happy with it. Build quality is better overall, and as jcs notes, the display is a major area of improvement. Not just brightness (I can actually see my screen in sunnier settings now!), but the blacks and viewing angles are so much better it almost feels…unthinkpaddy.

                            Other Pros:

                            • Battery life is another huge improvement. I’m getting 12h for normal usage.
                            • Somehow this thing is even lighter (2.5 lbs).
                            • Sturdier feeling, like they crossed an x220 with a previous gen X1 Carbon.
                            • High performance upgrades are available – I got the 1TB NVMe disk.

                            Cons (all minor):

                            • 802.11ac wireless is still flaky for me (on Linux), I’m seeing random disassociations.
                            • When the fans spin up they’re more noticeable than on the X1C3.
                            • The classic USB ports are really grippy, it feels like you have to use too much force.
                            • As usual the internal speakers suck. /shrug
                            1. 2

                              Glad to hear the display is brighter. I returned to the Thinkpad fold after 5 years, and purchased a X1C3. Very satisfied overall, and the only downside that has bothered me regularly is the low maximum screen brightness.

                              1. 1

                                I got an X260 for work and really regret it due to the shoddy brightness and aspect ratio. First time I’ve ever been disappointed with a Thinkpad, but I should have seen it coming. Definitely will be getting an X1 Carbon next time unless the retro model ends up happening before then (and shipping with a bright screen).

                                1. 1

                                  shoddy brightness

                                  Did you get a TN panel? My X240 has an IPS display and it’s excellent.

                                  1. 1

                                    Mine isn’t the IPS panel, but from what I’ve read the IPS they put in the X260 is still only around 300 nits and of course still the wretched 16:9. If it weren’t for the latter I’d consider modding it with a better display, but you can’t fix the aspect ratio, so I just keep it on my desk plugged into a couple better monitors; when I go out and about I just ssh in remotely from my X301.

                            1. 4

                              Thanks, this is a great deep dive. I’m considering adopting your simple style rule (“Do not use anything but the two or three argument forms of [.”)

                              In fact, dash, mksh, and zsh all agree that the result of [ -a -a -a -a ] is 1 when the file -a doesn’t exist, not a syntax error! Bash is the odd man out.
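
                              For what it’s worth, following that rule mostly means joining short [ commands with the shell’s own && instead of -a. A minimal sketch (the function name is made up):

                              ```shell
                              # Instead of: [ -n "$1" -a -e "$2" ]   (four or more arguments)
                              # use two short [ commands joined by the shell's && operator.
                              nonempty_and_exists() {
                                  [ -n "$1" ] && [ -e "$2" ]
                              }
                              ```

                              Each [ here sees only two arguments, so even a value like -a can’t be mistaken for an operator.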

                              I’ve discovered too many bizarre Bashisms to count, and hope you steer oilshell away from Bash behavior emulation and towards the POSIX shell spec. Even Bash’s supposed POSIX compatibility mode has Bashisms poking out. There’s one non-POSIX thing I dearly miss in /bin/sh: set -o pipefail. It’s difficult to write safe shell scripts without it, so much so that it should probably be in the spec.

                              1. 3

                                Thanks. The next post will be about Oil’s equivalents, so I’ll be interested to hear your feedback.

                                You can also just use [[, although it’s less portable. The one thing I don’t like about [[ (besides aesthetics, I prefer test), is that == does globbing, as I mention in the appendix. That should have been a different operator, as =~ is for regexes.

                                Bash and all the shells I’ve tested are more POSIX compatible than I would have thought. (Bash does have a tendency to hide unrelated bug fixes behind set -o posix though.)

                                The bigger issue is that POSIX is not good enough anymore. POSIX doesn’t have set -o pipefail like you say, and it also doesn’t even have local. For example, there are some Debian guidelines floating around that say use POSIX but add local and a few other things. Human-written scripts can’t get by with strict POSIX. Even echo -- is a problem.

                                This is the motivation behind the “spec tests” in OSH – to discover a more complete spec. Looking at what existing shells do is exactly how POSIX was made, although the process probably wasn’t automated and it was done many years ago.

                                I’m basically implementing what shells agree on. But I do have a bias toward bash behavior when it’s not ridiculous, because bash is widely deployed. When all shells disagree, you have to pick something, and picking dash or mksh makes no sense. POSIX typically doesn’t say anything at all in these cases, so it’s not much help.

                                1. 2

                                  This is the motivation behind the “spec tests” in OSH – to discover a more complete spec…I’m basically implementing what shells agree on.

                                  While we’re on the subject, here’s a point of disagreement between the shells that you might find interesting: assignment from a heredoc. If the POSIX shell spec has anything to say on this one, I couldn’t find it. Defining these kinds of behaviors in a more complete shell spec does seem to me like a very valuable endeavor on its own.

                                  But I do have a bias toward bash behavior when it’s not ridiculous, because bash is widely deployed. When all shells disagree, you have to pick something, and picking dash or mksh makes no sense. POSIX typically doesn’t say anything at all in these cases, so it’s not much help.

                                  I’d probably fall back on Bourne shell behavior as found in present day BSDs, or the Korn shell, or heck even dash, before going in for an obvious Bashism. Both Bourne and Korn exhibit careful, minimal design. Bash on the other hand was “anything goes” for a while there, with predictable implications for quality and security (“Wouldn’t it be cool if you could export functions to children via the env?!” => shellshock).

                                  But, if it’s a question of how facilities common to all shells should behave, then choosing the Bash behavior isn’t necessarily bad.

                                  1. 1

                                    Yes that’s the kind of thing that I’ve been testing. I copied it into my test framework:

                                    https://github.com/oilshell/oil/commit/a79ebc8437781b8edb8fd8ad03276fc6255af1f3

                                    Here are the results:

                                    http://www.oilshell.org/git-branch/dev/oil4/5ca7bacb/andy-home/spec/blog-other1.html

                                    I put his example as case 0, and his fix as case 1. Interestingly the “before” one works in mksh and zsh, but the “after fix” fails in those two shells.

                                    dash accepts all of them and bash fails at all of them.

                                    Case 2 is my rewrite of this, which works in OSH.

                                    But going even further, I think this construct can always be expressed more cleanly as a separate assignment and then here doc. I did think about this issue, because OSH prints a warning that it’s probably not what you want:

                                    osh warning: WARNING: Got redirects in assignment
                                    

                                    Though I think this example is conflating two issues: the command sub + here doc, and the fact that sed has two standard inputs – the pipe and tr. I didn’t tease those things apart and I suspect that would help reason about this.

                                    It’s definitely interesting but I’m going to leave it for now because it’s not from a “real” script… I still have a lot of work to do on those! But it’s in the repo in case anyone ever hits it.

                                    1. 4

                                      That’s a neat cross-shell test framework.

                                      It’s definitely interesting but I’m going to leave it for now because it’s not from a “real” script…

                                      This was from a real script (that post is from my blog) but after a bit of git grepping I still can’t find it, so how real can it be eh? I agree your rewrite to $(some complex multiline stuff) is cleaner than the same in backticks, but the questions around when a here-document should be interpreted remain.

                                      Though I think this example is conflating two issues: the command sub + here doc, and the fact that sed has two standard inputs – the pipe and tr.

                                      It doesn’t depend on sed actually. Simpler test case:

                                      foo=`cat`<<EOM
                                      hello world
                                      EOM
                                      echo "$foo"
                                      

                                      /bin/dash prints “hello world”, while Bash hangs waiting for input.

                                      1. 3

                                        Oh sorry it wasn’t clear from the blog post where the example came from!

                                        I just tested it out, and the simpler example works on dash, mksh, and zsh, but fails on bash. That is interesting and something I hadn’t considered. Honestly it breaks my model of how here docs are parsed. I wrote about that here [1].

                                        And while you can express this in OSH, it looks like OSH is a little stricter than bash even. So I’ll have to think about this.

                                        Right now I think there are some lower hanging fruit like echo -e and trap and so forth, but these cases are in the repo and won’t be lost.

                                        [1] http://www.oilshell.org/blog/2016/10/17.html

                                  2. 1

                                    Yeah, echo is terrible. When I went through my “learning about shell for real” experiences I ended up deciding that printf is the way
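
                                    For anyone following along, the usual replacement looks something like this (say is just a name picked for the sketch):

                                    ```shell
                                    # Print the arguments followed by a newline, without treating any of
                                    # them as options or expanding backslash escapes.
                                    say() {
                                        printf '%s\n' "$*"
                                    }
                                    ```

                                    Because the data goes through %s rather than the format string, say -n, say -e and say -- all print their argument literally.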

                                1. 8

                                  I think the takeaway here is a) don’t confuse all kinds of errors on an HTTP request with invalid tokens (I’m not familiar with the GitHub API, but I suppose it correctly returns 401 Unauthorized) and b) don’t delete important data, but flag it somehow.

                                  1. 5

                                    It returns a 404 which is a bit annoying since if you fat finger your URL you’ll get the same response as if a token doesn’t exist.

                                    https://developer.github.com/v3/oauth_authorizations/#check-an-authorization

                                    Invalid tokens will return 404 NOT FOUND

                                    I’ve since moved to a pattern of wrapping all external requests in objects whose state we can check explicitly, instead of relying on native exceptions coming from the underlying HTTP libraries. It makes things like checking the exact status code of a non-200 response easier.

                                    I might write on that pattern in the future. Here’s the initial issue with some more links https://github.com/codetriage/codetriage/issues/578

                                    1. 3

                                      Why not try to get issues, and if it fails with a 401, you know the token is bad? You can double check with the auth_is_valid method you’re using now…

                                      1. 2

                                        That’s a valid strategy.

                                        Edit: I like it, I think this is the most technically correct way to move forwards.

                                      2. 1

                                        Did the Github API return a 404 Not Found instead of a 5xx during the outage?

                                        1. 1

                                          No clue.

                                          1. 1

                                            Then there’s your problem. Your request class throws RequestError on every non-2xx response, and auth_is_valid? thinks any RequestError means the token is invalid. In reality you should only take 4xx responses to mean the token is invalid – not 5xx responses, network layer errors, etc.
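
                                            In other words, the check needs three outcomes rather than two. A tiny sketch of that decision, written as shell purely for illustration (the real code is Ruby, and token_state is a made-up name):

                                            ```shell
                                            # token_state STATUS
                                            # Prints "valid", "invalid", or "unknown" for an HTTP status code.
                                            # An empty STATUS stands for a network-level failure with no response.
                                            token_state() {
                                                case $1 in
                                                    2??) echo valid ;;
                                                    4??) echo invalid ;;   # only a client-error response condemns the token
                                                    *)   echo unknown ;;   # 5xx, timeouts, etc.: keep the token and retry later
                                                esac
                                            }
                                            ```

                                            Only the "invalid" outcome should ever lead to deleting anything. Narrowing the 4?? arm to just 404, the response the linked docs give for invalid tokens, would be stricter still.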

                                            1. 1

                                              Yep, that’s what OP in the thread said. I mention it in the post as well.

                                      3. 2

                                        I think the takeaway is that programmers are stupid.

                                        Programs shouldn’t delete/update anything, only insert. Views/triggers can update reconciled views so that if there’s a problem in the program (2) you can simply fix it and re-run the procedure.

                                        If you do it this way, you can also get an audit trail for free.

                                        If you do it this way, you can also scale horizontally for free if you can survive a certain amount of split/brain.

                                        If you do it this way, you can also scale vertically cheaply, because inserts can be sharded/distributed.
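
                                        A toy version of the idea, with a flat file standing in for the table and an awk one-liner standing in for the reconciled view (record and current_view are made-up names, and this is a sketch, not a database design):

                                        ```shell
                                        # record LOG KEY VALUE
                                        # Append a fact to the log; nothing is ever updated or deleted.
                                        record() {
                                            printf '%s\t%s\n' "$2" "$3" >> "$1"
                                        }

                                        # current_view LOG
                                        # Reconcile the log into "latest value per key", sorted by key.
                                        current_view() {
                                            awk -F '\t' '{ latest[$1] = $2 }
                                                         END { for (k in latest) print k "\t" latest[k] }' "$1" | sort
                                        }
                                        ```

                                        If the reconciliation logic turns out to be wrong, you fix current_view and run it again; the log itself is the audit trail.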

                                        If you don’t do it this way – this way which is obviously less work, faster and simpler and better engineered in every way, then you should know it’s because you don’t know how to solve this basic CRUD problem.

                                        Of course, the stupid programmer responds with some kind of made up justification, like saving disk space in an era where disk is basically free, or enterprise, or maybe this is something to do with unit tests or some other garbage. I’ve even heard a stupid programmer defend this crap because the unit tests need to be idempotent and all I can think is this fucking nerd ate a dictionary and is taking it out on me.

                                        I mean, look: I get it, everyone is stupid about something, but to believe that this is a specific, critical problem like having to do with 503 errors instead of a systemic chronic problem that boils down to a failure to actually think really makes it hard to discuss the kinds of solutions that might actually help.

                                        With a 503 error, the solution is “try harder” or “create extra update columns” or whatever. But we can’t try harder all the time, so there’ll always be mistakes. Is this inevitable? Can business truly not figure out when software is going to be done?

                                        On the other hand, if we’re just too fucking stupid to program, maybe we can work on trying to protect ourselves from ourselves. Write-only-data is a massive part of my mantra, and I’m not so arrogant to pretend it’s always been that way, but I know the only reason I do it is because I deleted a shit-tonne of customer data on accident and had the insight that I’m a fucking idiot.

                                        1. 4

                                          I agree with the general sentiment. It took me a bout 3 read throughs to parse through all the “fucks” and “stupids”. I think there’s perhaps a more positive and less hyperbolic way to frame this way.

                                          Append only data is a good option, and basically what I ended up doing in this case. It pays to know what data is critical and what isn’t. I referenced the acts_as_paranoid and it pretty much does what you’re talking about. It makes a table append only, when you modify a record it saves an older copy of that record. Tables can get HUGE, like really huge, as in the largest tables i’ve ever heard of.

                                          /u/kyrias pointed out that large tables have a number of downsides, such as making it harder to perform maintenance and make backups.
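A minimal sketch of the append-only idea described above, in Python with sqlite3. It is not how acts_as_paranoid is implemented; the table and column names (`customer_versions`, `record_id`, `version`, `deleted`) are made up for illustration. Every change, including a delete, becomes a new row, so older copies are never lost.

```python
import sqlite3


def open_db():
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE customer_versions (
               record_id INTEGER NOT NULL,
               version   INTEGER NOT NULL,
               name      TEXT,
               deleted   INTEGER NOT NULL DEFAULT 0,
               PRIMARY KEY (record_id, version)
           )"""
    )
    return db


def write(db, record_id, name, deleted=0):
    # Never UPDATE or DELETE: every change is a new row with a higher version.
    (latest,) = db.execute(
        "SELECT COALESCE(MAX(version), 0) FROM customer_versions"
        " WHERE record_id = ?",
        (record_id,),
    ).fetchone()
    db.execute(
        "INSERT INTO customer_versions (record_id, version, name, deleted)"
        " VALUES (?, ?, ?, ?)",
        (record_id, latest + 1, name, deleted),
    )


def current(db, record_id):
    # The newest version wins; a tombstone row (deleted=1) hides the record.
    row = db.execute(
        "SELECT name, deleted FROM customer_versions"
        " WHERE record_id = ? ORDER BY version DESC LIMIT 1",
        (record_id,),
    ).fetchone()
    if row is None or row[1]:
        return None
    return row[0]


def history(db, record_id):
    rows = db.execute(
        "SELECT name FROM customer_versions WHERE record_id = ? ORDER BY version",
        (record_id,),
    )
    return [r[0] for r in rows]
```

A "delete" here is `write(db, 1, None, deleted=1)`: `current` stops returning the record, while `history` still shows every earlier copy. The table growth mentioned above follows directly from this design.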

                                          1. 2

                                            You can do periodic data warehousing, though, to keep the tables as arbitrarily small as you’d like, but that introduces the possibility of programmer error when doing the data warehousing. It’s an easier problem to solve than making sure every destructive write is correct in every scenario, though.

                                            1. 1

                                              Tables can get HUGE, like really huge, as in the largest tables I’ve ever heard of

                                              I have tables with trillions of rows in them, and while I don’t use MySQL most of the time, even MySQL can cope with that.

                                              Some people try to do indexes, or they read a blog that told them to 1NF everything, and this gets them nowhere fast, so they’ll think it’s impossible to have multi-trillion-row tables, but if we instead invert our thinking and assume we have the wrong architecture, maybe we can find a better one.

                                              /u/kyrias pointed out that large tables have a number of downsides, such as making it harder to perform maintenance and make backups.

                                              And as I responded: /u/kyrias probably has the wrong architecture.

                                            2. 2

                                              Of course, the stupid programmer responds with some kind of made up justification, like saving disk space in an era where disk is basically free

                                              It’s not just about storage costs though. For instance at $WORK we have backups for all our databases, but if for some reason we needed to restore the biggest one from a backup, it would take days, during which all our user-facing systems would be down, which would be catastrophic for the company.

                                              1. 1

                                                You must have the wrong architecture:

                                                I fill about 3.5 TB of data every day, and it absolutely would not take days to recover my backups (I have to test this periodically due to audit).

                                                Without knowing what you’re doing I can’t say, but something I might do differently: Insert-only data means it’s trivial to replicate my data into multiple (even geographically disparate) hot-hot systems.

                                                If you do insert-only data from multiple split brains, it’s usually possible to get hot/cold easily, with the risk of losing (perhaps only temporarily) a few minutes of data in the event of catastrophe.
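One way to see why insert-only data replicates easily: if rows are never modified, combining the logs of two replicas is just a set union, and the order of merging does not matter. A small sketch, assuming every row carries a globally unique id (the ids and payloads below are hypothetical):

```python
def merge_logs(*logs):
    """Union of insert-only logs, keyed by a globally unique record id."""
    merged = {}
    for log in logs:
        for record_id, payload in log:
            # Rows are never modified, so a duplicate id carries the same
            # payload and can simply be skipped.
            merged.setdefault(record_id, payload)
    return sorted(merged.items())
```

With `site_a = [("a1", "x"), ("a2", "y")]` and `site_b = [("b1", "z"), ("a1", "x")]`, `merge_logs(site_a, site_b)` and `merge_logs(site_b, site_a)` give the same result. With updates and deletes in the mix, that property is much harder to get.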

                                              2. 0

                                                Unfortunately, if you hold any EU user data, you will have to perform an actual delete when an EU user asks you to delete their data, if you want to be compliant with EU rules. I like the idea of the persistence being an event log and then you construct views as necessary. I’ve heard that it’s possible to use this for almost everything and store an association of random-id to person, and then just delete that association when asked to in order to be compliant, but I haven’t actually looked into that carefully myself.
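The random-id idea can be sketched roughly like this. The class and method names are hypothetical, and the sketch assumes the event payloads themselves contain no personal data; whether this is enough for compliance is a separate question that the code cannot answer.

```python
import uuid


class EventStore:
    def __init__(self):
        self.events = []       # append-only list of (pseudonym, event)
        self.identities = {}   # pseudonym -> person; the only deletable part

    def register(self, person):
        pseudonym = uuid.uuid4().hex
        self.identities[pseudonym] = person
        return pseudonym

    def append(self, pseudonym, event):
        self.events.append((pseudonym, event))

    def forget(self, pseudonym):
        # The events stay in the log, but nothing links them to a person
        # any more once the association is gone.
        self.identities.pop(pseudonym, None)

    def events_for(self, person):
        ids = {p for p, who in self.identities.items() if who == person}
        return [event for p, event in self.events if p in ids]
```

After `forget`, the log is untouched and views built from it still add up, but `events_for` can no longer find anything for that person.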

                                                1. 2

                                                  That’s not true. The ICO recognises there are technological reasons why “actual deletion” might not be performed (see page 4). Having a flag that blinds the business from using the data is sufficient.

                                                  1. 1

                                                    Very cool. Thank you for sharing that. I was under the misconception that having someone in the company being capable of obtaining the data was sufficient to be a violation. It looks like the condition to be compliant is weaker than that.

                                                    1. 2

                                                      No problem. A big part of my day is GDPR-related at the moment, so I’m unexpectedly well versed in this stuff.

                                                2. 0

                                                  There’s actually a database out there that enforces the never-delete approach (together with some other very nice paradigms/features). Sadly it isn’t open source:

                                                  http://www.datomic.com/

                                              1. 11

                                                Nice! I use a similar trick for nginx virtual hosting:

                                                  # Proxy to a backend server based on the hostname.
                                                  if (-d /var/lib/nginx/vhosts/$host) {
                                                    proxy_pass http://unix:/var/lib/nginx/vhosts/$host/server.sock;
                                                    break;
                                                  }
                                                

                                                This lets you add and remove virtual hosts without touching nginx config. Like the bind broker, the /var/lib/nginx/vhosts/$host/ directories also let you apply the unix permissions model.

                                                This maps IP ownership (which doesn’t really exist in unix) to the filesystem namespace (which does have working permissions).

                                                I wish more programs supported unix domain sockets for exactly this reason. Well, that, and because path allocation is easier to coordinate than port allocation.
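For the backend side of this setup, here is a rough Python sketch of a server that listens on a unix domain socket and restricts who may connect through the file mode of the socket path. The function names, the reply, and the `0o660` mode are illustrative assumptions; a temporary directory stands in for `/var/lib/nginx/vhosts/$host/`.

```python
import os
import socket
import tempfile
import threading


def serve_once(sock_path, reply, mode=0o660):
    """Listen on a unix domain socket, answer one connection, then clean up."""
    if os.path.exists(sock_path):
        os.unlink(sock_path)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(sock_path)
    # The file mode of the socket path is where the unix permissions
    # model comes in.
    os.chmod(sock_path, mode)
    server.listen(1)

    def handle():
        conn, _ = server.accept()
        with conn:
            conn.sendall(reply)
        server.close()
        os.unlink(sock_path)

    worker = threading.Thread(target=handle)
    worker.start()
    return worker


def demo():
    # A temporary directory stands in for /var/lib/nginx/vhosts/$host/.
    vhost_dir = tempfile.mkdtemp()
    sock_path = os.path.join(vhost_dir, "server.sock")
    worker = serve_once(sock_path, b"hello from backend\n")
    mode = os.stat(sock_path).st_mode & 0o777
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(sock_path)
    data = client.recv(1024)
    client.close()
    worker.join()
    os.rmdir(vhost_dir)
    return mode, data
```

Creating or removing the directory and its socket is all it takes to add or remove a backend, which is the "path allocation" convenience mentioned above.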

                                                Then I broke out the big cheat stick and just spliced the sockets together. In reality, we’d have to set up a read/copy/write loop for each end to copy traffic between them. That’s not very interesting to read though.

                                                So, today I learned about SO_SPLICE. Awesome! But why is it necessary to set up a read/copy/write loop? Isn’t that what SO_SPLICE does for you?
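For reference, the userspace read/copy/write loop that the quoted passage refers to looks roughly like the following. This is a sketch in Python that does not use SO_SPLICE at all; the socketpairs are stand-ins for the client and backend connections, and the function names are made up.

```python
import socket
import threading


def pump(src, dst, bufsize=65536):
    """Copy bytes from src to dst until src reaches end-of-file."""
    while True:
        chunk = src.recv(bufsize)   # read
        if not chunk:
            break
        dst.sendall(chunk)          # write
    # Propagate end-of-file so the peer sees the stream finish.
    dst.shutdown(socket.SHUT_WR)


def relay(a, b):
    """Copy traffic in both directions between two connected sockets."""
    threads = [
        threading.Thread(target=pump, args=(a, b)),
        threading.Thread(target=pump, args=(b, a)),
    ]
    for t in threads:
        t.start()
    return threads


def demo():
    client, relay_front = socket.socketpair()
    relay_back, backend = socket.socketpair()
    threads = relay(relay_front, relay_back)

    client.sendall(b"ping")
    client.shutdown(socket.SHUT_WR)
    request = backend.recv(100)
    backend.sendall(b"pong")
    backend.shutdown(socket.SHUT_WR)
    response = client.recv(100)

    for t in threads:
        t.join()
    for s in (client, relay_front, relay_back, backend):
        s.close()
    return request, response
```

Each direction needs its own loop, and every byte passes through the relay process on the way. That copying is the work a kernel-level splice is meant to take over.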

                                                1. 4

                                                  The splicing code doesn’t even work on OpenBSD, because it’s not (yet) possible to splice sockets from different domains. I wouldn’t want the answer key to be so easy to steal. Bonus points if you fix the kernel instead.

                                                  1. 2

                                                    Thanks, I missed that bit. In setsockopt(2) it says “both sockets must be of the same type,” but nothing about family, so I assumed any two SOCK_STREAM sockets could be spliced.

                                                1. 15

                                                  Just use runit. It’s dead simple, actually documented, actually used in production, BSD licensed, and so on. I use it on my work computer with no problems. It’s no bullshit, no bloat, you don’t have to “learn runit” to use it and get exactly what you want.

                                                  1. 5

                                                    I’m aware of runit. It does seem pretty nice, but there are a few things about it that bother me. I don’t want to get into specifics here since it can so easily become a matter of one opinion vs the other, but I’ll try to write about some general issues which Dinit should handle well (and which I don’t think runit does) at some point in the near future.

                                                    1. 3

                                                      Well, one thing that comes to mind is that runit doesn’t deal well (or at all) with (double-)forking services. Those are unfortunate by themselves (I mean, let the OS do its job please!) but still exist.

                                                    2. 3

                                                      I have run into some odd behavior with runit a time or two, somehow managing to get something into a weird wedged state. I could never figure out what the exact problem was (maybe it is fixed by now?). Oddly enough, I never had the same issue with ye olde daemontools.

                                                      Aside from that, I do also like runit – as a non pid 1 process supervisor.

                                                      1. 2

                                                        We use runit heavily at my job. It’s a massive pain to deal with, and we have to use a lot of automation to deal with the incredibly frequent issues we have with it. I would never recommend it to anyone, honestly.

                                                        1. 2

                                                          Examples?

                                                          1. 4

                                                            I’ve mentioned this here: https://lobste.rs/s/2qjf4o/problems_with_systemd_why_i_like_bsd_init#c_8qtwla

                                                            Also, since then, we’ve had problems with svlogd losing track of the process that it’s logging for. Also it should be noted that you absolutely don’t get logging for free, and it requires additional management.

                                                            1. 2

                                                              Runit does have support for dependencies in a way: you put the service start command in the run file and it starts the other service first, or blocks until it finishes starting. Right?

                                                              How does it lose track of its controlled processes? Like do you know what causes it? For example I know runit doesn’t manage double forking daemons.

                                                              What kind of scaffolding have you set up to ensure logging? What breakages do you have to guard against? How do you guard against them?

                                                              Do you know why svlogd loses the process? As I understand, it’s just hooked to stdout/stderr, so how could it lose track? What specific situations does that happen in? How did you resolve?

                                                              I know it’s a lot of questions, but I’m genuinely curious and would love to learn more.

                                                              1. 5

                                                                How does it lose track of its controlled processes? Like do you know what causes it? For example I know runit doesn’t manage double forking daemons.

                                                                The reason runit, daemontools classic, and most other non-pid-1 supervisors “lose track” of supervised processes comes down to the lack of setsid(2). If you have a multiprocess service, in 99% of cases you should create a new process group for it and use process group signaling rather than single process signaling. If you don’t signal the entire process group when you sv down foo, you’re only killing the parent, and any children will become orphans inherited by pid 1, or an “orphaned process group” that might keep running.
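The difference between signalling one process and signalling its process group can be sketched in a few lines of Python. This is only an illustration of the mechanism, not code from runit or daemontools; the two-process "service" is a hypothetical shell that forks a `sleep`.

```python
import os
import signal
import subprocess


def start_service(argv, **kwargs):
    # start_new_session=True makes the child call setsid(2), so it becomes
    # the leader of a new session and process group (pgid == its pid).
    return subprocess.Popen(argv, start_new_session=True, **kwargs)


def stop_service(proc, sig=signal.SIGTERM):
    # Signal the whole process group, not just the parent process.
    os.killpg(proc.pid, sig)
    return proc.wait()


def demo():
    # The shell prints the pid of its background child, then waits for it.
    proc = start_service(
        ["sh", "-c", "sleep 300 & echo $!; wait"],
        stdout=subprocess.PIPE,
    )
    child_pid = int(proc.stdout.readline())
    same_group = os.getpgid(child_pid) == proc.pid
    status = stop_service(proc)
    proc.stdout.close()
    return same_group, status
```

Replacing `os.killpg(proc.pid, sig)` with `os.kill(proc.pid, sig)` gives the single-process behaviour described above: the shell dies and the `sleep` keeps running as an orphan.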

                                                                A few years back I got a patch into daemontools-encore to do all of this, although we screwed up the default behavior (more on that in a second). You can read more about the hows and whys of multiprocess supervision in that daemontools-encore PR.

                                                                If you’re using a pid-1 supervisor like BSD init, upstart, systemd, etc it can do something more intelligent with these orphans, since it’s the one that inherits them. Also, pid-1 supervisors usually run services in a new process group by default.

                                                                Now, the screw-up: when we added multiprocess service support to daemontools-encore, I initially made setsid an opt-in feature. So by default, services wouldn’t run in a new process group, which is the “classic” behavior of daemontools, runit, et al. There are a few popular services like nginx that actually rely on this behavior for hot upgrades, or for more control over child processes during shutdown. Unfortunately I let myself get talked out of that, and we made setsid opt-out. That broke some of these existing services for people, and the maintainer did the worst thing possible, and half-backed out multiprocess service support.

                                                                At this point bruceg/daemontools-encore is pretty broken wrt multiprocess services, and I wouldn’t recommend using it. I don’t have the heart to go back and argue with the maintainer that we fix it by introducing breaking behavior again. Instead I re-forked it, fixed multiprocess support, and have been happily and quietly managing multiprocess services on production systems for several years now. It all just works. If you’re interested, here’s my fork: https://github.com/acg/daemontools-encore/tree/ubuntu-package-1.13

                                                                I guess I’ll end with a request for advice. How should I handle this situation fellow lobsters? Suck it up and get the maintainer to fix daemontools-encore? Make my fork a real fork? Give up and add proper setsid support to another daemontools derivative like runit?

                                                                1. 1

                                                                  Thank you for all your answers! Can you comment on the -P flag in runsvdir? Does that not do what you want?

                                                                  1. 4

                                                                    There are several problems with multiprocess services in runit.

                                                                    1. As mentioned above, some services should not use setsid, although most properly written services should. But runsvdir -P is global.

                                                                    2. If you use runsvdir -P, then sv down foo should use process group signalling instead of parent process signalling, or you can still create orphans. As another example, sv stop foo should send SIGSTOP to all processes in the process group, but since it doesn’t, everyone but the parent process continues to run (ouch!). Unfortunately runit entirely lacks facilities for process group signalling.

                                                                    In my patched daemontools-encore:

                                                                    • svc -=X foo signals the parent process only
                                                                    • svc -+X foo signals the entire process group
                                                                    • svc -X foo does one or the other depending on whether you’ve marked the service as multiprocess with a ./setsid file

                                                                    But generally you just use the standard svc -X foo, because it does the right thing.
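The decision rule behind those three forms can be written down as a small sketch. This is Python for illustration only, not the C code in the patched daemontools-encore; the function names and the `mode` encoding are assumptions.

```python
import os
import signal


def signal_target(mode, multiprocess):
    """Return "process" or "group" for the three flag forms.

    mode is "=" for `svc -=X`, "+" for `svc -+X`, "" for plain `svc -X`.
    multiprocess mirrors whether the service has a ./setsid file.
    """
    if mode == "=":
        return "process"
    if mode == "+":
        return "group"
    if mode == "":
        return "group" if multiprocess else "process"
    raise ValueError("unknown mode: %r" % (mode,))


def send(pid, sig, mode="", multiprocess=False):
    if signal_target(mode, multiprocess) == "group":
        # Assumes the service called setsid, so its pgid equals its pid.
        os.killpg(pid, sig)
    else:
        os.kill(pid, sig)
```

The plain form is the one that "does the right thing": it only needs to know whether the service was marked as multiprocess.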

                                                                    Besides the things mentioned above, runsvdir -P introduces some fresh havoc in other settings. Try this in a foreground terminal:

                                                                    mkdir -p ./service/foo
                                                                    printf '#!/bin/sh\nfind / | wc -l\n' > ./service/foo/run
                                                                    chmod +x ./service/foo/run
                                                                    runsvdir -P ./service
                                                                    ^C
                                                                    ps ax | grep find
                                                                    ps ax | grep wc
                                                                    

                                                                    The find / | wc -l is still running, even though you ^C’ed the whole thing! What happened? Well, things like ^C and ^Z result in signals being sent to the terminal’s foreground process group. Your service is running in a new, separate process group, so it gets spun off as an orphan. The only good way to handle this is for the supervisor to trap and relay SIGINT and SIGHUP to the process groups underneath it.

                                                                    To those wondering who runs a supervisor in a foreground terminal as non-root…me! All the time. The fact that daemontools derivatives let you supervise processes directly, without all that running-as-root action-at-a-distance system machinery, is one of their huge selling points.

                                                                    1. 2

                                                                      Dinit already used setsid; today I made it signal service process groups instead of just the main process. However, when run as a foreground process (which, btw, Dinit fully supports; that’s how I usually test it), you can specify that individual services run “on console”, and they won’t get setsid()'d in this case. I’m curious though as to how running anything in a new session (with setsid) actually causes anything to break? Unless a service somehow tries to do something to the console directly it shouldn’t matter at all, right?

                                                                      1. 2

                                                                        I’m curious though as to how running anything in a new session (with setsid) actually causes anything to break? Unless a service somehow tries to do something to the console directly it shouldn’t matter at all, right?

                                                                        The problems are outlined above. ^C, ^Z etc get sent to the tty’s foreground process group. If the supervisor is running foreground with services under it in separate process groups, they will continue running on ^C and ^Z. In order to get the behavior users expect – total exit and total stop of everything in the process tree, respectively – you need to catch SIGINT, SIGTSTP, and SIGCONT in the foreground process group and relay them to the service process groups. Here’s what the patch to add that behavior to daemontools-encore looked like.
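A rough sketch of that relaying, in Python rather than the C of the actual patch. The names are hypothetical, and one choice here is an assumption of this sketch, not something taken from the patch: SIGTSTP is relayed as SIGSTOP, because the default stop action of SIGTSTP is not performed for members of an orphaned process group, which is what a service in its own session is.

```python
import os
import signal
import subprocess

service_groups = set()   # process group ids of supervised services

# What to send to the services for each signal the supervisor receives.
RELAY_AS = {
    signal.SIGINT: signal.SIGINT,
    signal.SIGTSTP: signal.SIGSTOP,   # assumption of this sketch, see above
    signal.SIGCONT: signal.SIGCONT,
}


def start(argv):
    proc = subprocess.Popen(argv, start_new_session=True)
    service_groups.add(proc.pid)      # after setsid, pgid == pid
    return proc


def relay(signum, frame=None):
    for pgid in list(service_groups):
        try:
            os.killpg(pgid, RELAY_AS[signum])
        except ProcessLookupError:
            service_groups.discard(pgid)   # group is already gone


def install_relays():
    # Call once from the supervisor's main thread.
    for signum in RELAY_AS:
        signal.signal(signum, relay)
```

A real supervisor would also act on the signal itself after relaying it (exit on SIGINT, stop itself on SIGTSTP); that part is left out here.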

                                                                      2. 2

                                                                        Thanks for the info! I still have a few questions, correlated to your numbered points:

                                                                        1. Like nginx yes? How does a pid 1 handle nginx differently / what makes a pid 1 different? If 99% of stuff needs the process group signaled, but nginx works with pid 1 supervisors, do they not signal the process group? How does all that work? And how does all of this tie in to using runit as a pid 1? Would the problems you have with it not exist for people using it as a pid 1? Because the original discussion was about alternate init systems, which is how I use it.

                                                                        2. This would only create orphans if the child process ignores sighup right? Obviously that’s still a big problem, but am I correctly understanding that? And when runsvdir gets sighup it then correctly forwards sigterm to its children yes? Not as easy as ^C but still possible. Would any of this behavior be different if you were running as root, but still not as pid 1?

                                                        1. 2

                                                          Can anyone who has a Calyx hotspot comment on the experience? I had a Clear Spot for several years, and it was perfect for remote work: slower 3G speeds, but no caps and decent coverage. It was a sad day when Sprint bought Clear and wound them down.

                                                          Since then it’s been virtually impossible to find a slower no-cap data plan. I got a Karma Go, which was unlimited in the beginning, but they’ve since pulled the bait-and-switch to a capped 10GB plan at $75/month. Been pretty unhappy with it.

                                                          1. 17

                                                            I only see the systemd things that make it to lobste.rs, so pretty heavy bias, but I don’t think I’ve read a single one of these bug reports, where systemd does something unexpected and horrible, in which poettering has responded with “yep, this is a bug and sorry we’ll fix it”. It’s just weirdly anti-user. I don’t know if Linux users just don’t care or they’ve just given up after all of the init daemons various Linux distros have gone through.

                                                            I wonder what things will be like in 10 years once systemd is everywhere. Maybe people don’t dare touch the default config and just work around issues in odd ways or people are just miserable but need to wait for the next generation to come along to be energetic enough to propose a whole new init system and deploy it.

                                                            1. 8

                                                              I’m pretty sure we’ve given up. Also pretty sure it’ll be a miserable mess of workarounds and hacks.

                                                              Hopefully, then, someone starts advocating for Runit or the suckless people resurrect Uselessd, because there is sanity left in the world. Practically no one uses sanity, though.

                                                              1. 10

                                                                Come to BSD, we don’t suck (yet)!

                                                                1. 1

                                                                  I’d love to use OpenBSD but only a handful of VPS/Server providers support it.

                                                                  1. 6

                                                                    OpenBSD offers nice bsd.rd images which can be loaded into RAM and started with your favourite bootloader.

                                                                    That means you can get yourself any KVM (or similar ‘real’ virtualisation) vps (these can be really cheap, starting at €1/mo or even less), choose a Linux template and manually install OpenBSD pretty quickly.

                                                                    Here’s a nice guide documenting the process. Bear in mind you might need to change the set root=(hd0) in the grub config. For example, on some boxes I have, I had to use set root=(hd0,msdos1).

                                                                    1. 1

                                                                      Thanks, maybe I’ll try that.

                                                                      My problem is that I want to have something “just work” without fiddling with stuff.

                                                                      I believe OpenBSD itself would certainly qualify, but hosting it is another matter.

                                                                    2. 1

                                                                      Is there anything that keeps you from using one of those handful of providers? As far as I’m aware, they aren’t significantly more expensive than their competition.

                                                                      1. 1

                                                                        I recently tried to get OpenBSD working on Vultr, which is one of them.

                                                                        If I remember correctly, no matter what I did, sudo refused to recognize that a user I added was in the “wheel” group, even though the command to list a user’s groups showed that was the case.

                                                                        It was really strange, and after a few reinstalls etc, I gave up.

                                                                        1. 2

                                                                          I have a few OpenBSD VMs on Vultr and they work fine (aka, Works on My Machine™)

                                                                          Without any more details the problem here could have been anything. Did you log the user out and back in again after changing the group? Maybe you didn’t put a colon in front of the group name in doas.conf? If you were using sudo, did you uncomment the required line in sudoers?

                                                                          1. 1

                                                                            I don’t remember what I did anymore, but I guess I should try again.

                                                                            It would be nice to run OpenBSD for all of my server needs but I can’t tell if it’s a good idea for a business (which I’m building).

                                                                          2. 1

                                                                            forget sudo, use doas!

                                                                            1. 1

                                                                              I may have been trying to - I don’t remember now.

                                                                              1. 1

                                                                                Ah, well, I’m not exactly an openbsd expert but if you want to try again highlight me in #lobsters and I can show you my configuration.

                                                                                1. 1

                                                                                  Thanks :)

                                                                    3. [Comment removed by author]

                                                                      1. 5

                                                                        I use Void for work where I need to use Linux, 10/10 would recommend.

                                                                        For anyone who does use it, the only thing that tripped me up is you need to install git-perl and possibly git-extras in order to get whatever else you normally get when you install git. After learning that, I got used to the extreme package granularity so I don’t remember if it’s that way for anything else, as that’s now always the first thing I check when a command was suspiciously missing.

                                                                        Other than that, long live BSD.

                                                                      2. 3

                                                                        Hopefully, then, someone starts advocating for Runit.

                                                                        I’m surprised none of the BSDs have adopted a daemontools derivative into base yet. Seems like a match made in heaven: a small and carefully written codebase, a focus on simplicity and reliability, directly executable service scripts, no action-at-a-distance pidfile nonsense, sane log rotation. I suppose you can chalk up the inertia to the conservation laws which govern the BSD world. Which, viewed from the systemd angle, look more like a virtue than a shortcoming.

                                                                        1. 1

                                                                          Well, for me, the daemontools everything in a maze of symlinks configuration is really alien. One can argue its merits, but it’s definitely different, and not clear how one evolves from some simple rc scripts to that.

                                                                          1. 3

                                                                            Fair. Maybe there’s a good rc-to-daemontools migration guide, but I usually just write the ./run file from scratch and chuck all the supervision and logging boilerplate in the rc script. You won’t need the vast majority of it.

                                                                            The daemontools everything in a maze of symlinks configuration is really alien.

                                                                            It definitely is at first. The only symlinks you really need are of the form /var/service/foo -> /etc/service/foo. Those are necessary for clean uninstallation of services (see “How do I remove a service”), and that has to do with how svscan and supervise coordinate directory-based operations.

                                                                            Convention recommends some other symlinks though. Some people don’t care, but I like to have foo/supervise, foo/log/supervise, and foo/log/main point back into /var for hier(7) reasons. For services that serve from filesystem state, a foo/root symlink is a good idea, as it lets you do atomic deploys via atomic symlink replacement.
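Atomic symlink replacement can be sketched like this in Python. The `.tmp` suffix and the directory names are made up for the example; the important part is that the new symlink is created next to the old one and then renamed over it.

```python
import os
import tempfile


def atomic_symlink(target, link_path):
    """Point link_path at target without a moment where link_path is missing."""
    # The temporary name lives in the same directory, so the rename below
    # stays on one filesystem.
    tmp = link_path + ".tmp"
    if os.path.lexists(tmp):
        os.unlink(tmp)
    os.symlink(target, tmp)
    # rename(2) replaces the old symlink itself in a single step.
    os.replace(tmp, link_path)


def demo():
    base = tempfile.mkdtemp()
    releases = [os.path.join(base, name) for name in ("release-1", "release-2")]
    for path in releases:
        os.mkdir(path)
    root = os.path.join(base, "root")
    seen = []
    for path in releases:
        atomic_symlink(path, root)
        seen.append(os.readlink(root) == path)
    return seen
```

Removing the old link and then creating the new one would leave a short window in which `foo/root` does not exist; the rename avoids that window.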

                                                                            1. 1

                                                                              Daemontools is a bit alien, but the logging system is superior to what is there for services to use in openbsd imo.

                                                                        2. 2

                                                                          There is definitely a selection bias here. Here is the first random, closed ticket I opened on their issue tracker: https://github.com/systemd/systemd/issues/5483 I’m not saying they are all good, but they aren’t all bad.

                                                                        1. 1

                                                                          It would be absolutely trivial for Amazon to place EC2 metadata, including IAM credentials, into XenStore; and almost as trivial for EC2 instances to expose XenStore as a filesystem to which standard UNIX permissions could be applied, providing IAM Role credentials with the full range of access control functionality which UNIX affords to files stored on disk. Of course, there is a lot of code out there which relies on fetching EC2 instance metadata over HTTP, and trivial or not it would still take time to write code for pushing EC2 metadata into XenStore and exposing it via a filesystem inside instances.

                                                                          Less radical solution: expose the EC2 instance metadata server on a root-owned 0700 unix domain socket. You wouldn’t get the kind of fine-grained metadata access control that Colin imagines, but your instance could at least use unix groups and permissions to grant access to certain non-root processes.

                                                                          1. 5

                                                                            I’ve been poking around at updating my mail reading setup in a sort of desultory fashion for a while; I’m 90% of the way there. I’ve been able to discard great gobs of my own horrible elisp customization in favor of mu4e’s context abstraction, and msmtp makes sending a lot cleaner. This is useful because I hate all IMAP clients on OS X, and I can run mu4e inside emacs on my Linux machine and just avoid the whole stinking mess.

                                                                            Amusingly enough, the only way for me to be satisfied with the UI of software on Linux is to run it inside an ancient and crufty lisp development environment.

                                                                            I also have to retube my amp, because I blew a power tube last practice. It hardly seems fair; it’s way too much amp for the practice space, so I play through a powersoak, which means it doesn’t sound as glorious as when it’s driven full-throated. But I still run the tubes at full power, so BLAM, there goes $250. Sigh.

                                                                            For work, wrangling a bunch of scratch tables into a sane, normalized schema that enables actually reasoning about the data, rather than just throwing ad-hoc queries against the wall. The observation that joins can slow queries down has metastasized into a mythology that normalization is bad. I blame the active record pattern, and ActiveRecord in particular, for this criminally negligent cargo-cult belief system. Just because Rails is too stupid to make use of the tools the database provides is no reason to ignore them, particularly when we’re not using an ORM.

                                                                            1. 4

                                                                              Nice. I use mutt + offlineimap + msmtp. Do you have multiple email accounts you need to smtp through? Last week I discovered I was doing multiple msmtp accounts all wrong, and it was sending through the default account every time. You need the from field for envelope-from matching:

                                                                              account foo
                                                                              host foo.example.com
                                                                              port 587
                                                                              auth on
                                                                              from me@foo.example.com
                                                                              user me@foo.example.com
                                                                              

                                                                              This wasn’t an issue for me until recently, when I discovered that sending through the wrong gmail account added the X-Google-DKIM-Signature: header instead of the DKIM-Signature: header, and gmail receivers with DMARC enabled don’t like the former. Also: the report I got by sending mail to mailtest@unlocktheinbox.com was invaluable in tracking down DKIM / DMARC issues.
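
                                                                              For anyone hitting the same thing, here is a toy model of the failure mode, not msmtp’s actual code: as I understand it, when selecting by envelope-from msmtp takes the first account whose from address matches and otherwise falls back to the default account. The `Account` class and `pick_account` function are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Account:
    name: str
    host: str
    # Models the `from` line; None stands for an account that omits it.
    from_addr: Optional[str] = None


def pick_account(accounts: List[Account], envelope_from: str,
                 default: Account) -> Account:
    """Return the first account whose from address equals envelope_from.

    Toy model only: with no matching `from`, everything falls through to the
    default account, which is the symptom described above.
    """
    for acct in accounts:
        if acct.from_addr is not None and acct.from_addr == envelope_from:
            return acct
    return default
```

                                                                              With the from line missing from the foo account, `pick_account` returns the default for mail from me@foo.example.com; adding it makes the foo account match.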

                                                                              1. 2

                                                                                This is good to know. I have a half-dozen different accounts, so I’ll need to be careful with msmtp. Still, it’s better than having all my various SMTP niblets stuck in various s-expressions scattered about my org-mode startup file. Hat tip for that mailtest business, that’ll help a lot.

                                                                            1. 15

                                                                              So, instead of just collectively grumping about Uncle Bob, let’s pick out a piece of his article that’s actually worth discussing.

                                                                              1. What do you think it would mean to be “professional” in software engineering?

                                                                              2. What do you think we have to do to achieve (1)?

                                                                              3. Do you agree that limiting our tools to reduce churn is a good approach? Why or why not?

                                                                              1. 12

                                                                                I think choosing not to make life difficult for those who come after us is a professional trait. That may include sticking to a reduced, but standardized, tool set.

                                                                                After the development phase, software projects often go into maintenance mode, where a rotating cast of temp contractors is brought in to make necessary tweaks. The time you save by building a gloriously elegant automaton must be weighed against the cumulative time all of them must spend deciphering how the system works.

                                                                                1. [Comment removed by author]

                                                                                  1. 5

                                                                                    It pains me to say this but regulation.

                                                                                    I think not just regulation, but effectively implemented regulation.

                                                                                    I’ve worked in several regulated industries or on systems where there are industry standards like PCI that need to be followed/applied and things are really no better. In fact, regulations can sometimes cause more problems - the rigorous testing/validation requirements mean that once a system is productive, it’s not patched because of the onerous testing requirement (testing that should, ideally, be automated but just isn’t in most organisations).

                                                                                    Yes, that comes down to organisational practises, but we really should be in a better place in 2016 - Sarbanes-Oxley has helped in a lot of areas with things like segregation of duties, proper record keeping, etc, but it’s only a drop in the ocean.

                                                                                  2. 9

                                                                                    Do you agree that limiting our tools to reduce churn is a good approach? Why or why not?

                                                                                    All other things equal, yes. Maciej Ceglowski [0]:

                                                                                    I believe that relying on very basic and well-understood technologies at the architectural level forces you to save all your cleverness and new ideas for the actual app, where it can make a difference to users.

                                                                                    I think many developers (myself included) are easily seduced by new technology and are willing to burn a lot of time rigging it together just for the joy of tinkering. So nowadays we see a lot of fairly uninteresting web apps with very technically sweet implementations. In designing Pinboard, I tried to steer clear of this temptation by picking very familiar, vanilla tools wherever possible so I would have no excuse for architectural wank.

                                                                                    I complain about frontend engineers and their magpie tendencies, but backend engineers have the same affliction, and its name is Architectural Wank. This theme of brutally limiting your solution space for non-core problems is elaborated on further in “Choose Boring Technology” [1]:

                                                                                    Let’s say every company gets about three innovation tokens. You can spend these however you want, but the supply is fixed for a long while. You might get a few more after you achieve a certain level of stability and maturity, but the general tendency is to overestimate the contents of your wallet. Clearly this model is approximate, but I think it helps.

                                                                                    If you choose to write your website in NodeJS, you just spent one of your innovation tokens. If you choose to use MongoDB, you just spent one of your innovation tokens. If you choose to use service discovery tech that’s existed for a year or less, you just spent one of your innovation tokens. If you choose to write your own database, oh god, you’re in trouble.

                                                                                    [0] https://web.archive.org/web/20111228005908/http://www.readwriteweb.com/hack/2011/02/pinboard-creator-maciej-ceglow.php

                                                                                    [1] http://mcfunley.com/choose-boring-technology

                                                                                    1. 2

                                                                                      “All other things equal” is one hell of a caveat, though :)

                                                                                      I’m a huge fan of the healthy skepticism both Dan McKinley and Maciej exhibit when it comes to technology decisions. When something passes the high bar for making a technology change, though, make that change! Inertia is not a strategy.

                                                                                    2. 8

                                                                                      2.a: Take diversity seriously. Don’t act like raging testosterone poisoned homophobic ethnophobic nits just because we’ve been able to get away with it in the past.

                                                                                      2.b: Work to cleanly separate requirements and the best tools to satisfy them in the least amount of time from our desire to play with new toys all the time.

                                                                                      2.c: Stop putting $OTHER language down all the time because we see it as old/lame/too much boilerplate/badly designed. If people are doing real useful work in it, clearly it has value. Full stop.

                                                                                      Those would be a good start.

                                                                                      3: See 2.b - I think saying “Let’s limit our tools” is too broad a statement to be useful. Let’s work to keep our passions within due bounds and try to make cold hard clinical decisions about the tools we use and recognize that if we want to run off and play with FORTH because it’s back in vogue, that’s totally cool (there’s all kinds of evidence that this is a good thing for programmers in any number of ways) but that perhaps writing the next big project at work in it is a mistake.

                                                                                      1. 1

                                                                                        What do you think it would mean to be “professional” in software engineering?

                                                                                        Our stupid divisions and churn come partly from employers and partly from our own crab mentality as engineers.

                                                                                        They come from employers insofar as most people in hiring positions have no idea how to hire engineers nor a good sense of how easily we can pick up new technologies, so they force us into tribal boxes like “data scientist” and “Java programmer”. They force us into identifying with technological choices that ought to be far more prosaic (“I’m a Spaces programmer; fuck all you Tabs mouth-breathers”). This is amplified by our own tribalism as well as our desire to escape our low status in the corporate world coupled with a complete inability to pull it off; that is, crab mentality.

                                                                                        What do you think we have to do to achieve (1)?

                                                                                        I’ve written at length on this and I don’t think my opinions are secret. :)

                                                                                        Do you agree that limiting our tools to reduce churn is a good approach? Why or why not?

                                                                                        I’m getting tired of the faddishness of the industry, but I don’t think that trashing all new ideas just because they’re “churn” is a good idea either. New ideas that are genuinely better should replace the old ones. The problem is that our industry is full of low-skill young programmers and technologies/management styles designed around their limitations, and it’s producing a lot of churn that isn’t progress but just new random junk to learn that really doesn’t add new capabilities for a serious programmer.

                                                                                        1. 1

                                                                                          I’m getting tired of the faddishness of the industry, but I don’t think that trashing all new ideas just because they’re “churn” is a good idea either

                                                                                          I agree, I completely agree. I absolutely understand that it is foolish to adopt new tech before it has developed good tooling (and developed, as someone pointed out in a comments section somewhere, a robust bevy of answers on Stack Overflow). You’re just making your developers’ lives harder. Still, trashing new ideas is also silly, for a very good reason.

                                                                                          I think that the argument ignores genuine advances in technology. In the article, Java is likened to a screwdriver. Sure, throwing away a screwdriver for a hammer is nonsensical tribalism, but throwing away a screwdriver for a power drill isn’t. There will be times when I want to explicitly write to buffers – I’ll use C or C++ as needed. But why would I otherwise pick a language that segfaults, when advances in language design and compiler theory have yielded Rust, which may well do the same thing*?

                                                                                          It might cost more in the short term to tear down the wooden bridge and build a concrete bridge. Heck it might cost more in the long term to do so, if concrete is more expensive to maintain (I acknowledge my analogy is getting a tad overwrought.) But aren’t better guarantees about the software you produce worth it?

                                                                                          For the record, I’m not trying to speak as a Rust evangelist here – it’s just a topic I know about that fits the argument. It’s new, it’s still developing its tooling, but it clearly represents progress in programming language theory.

                                                                                          For another example, imagine if the people in the argument used vim. Vim is robust and powerful – but many people consider it a poor choice of tool for Java development. How would I convince this person to switch from vim to IntelliJ? Isn’t IntelliJ just another example of churn? It’s a new shiny tool, right? Thoughtful consideration of new stuff is required to distinguish between “churn” and “hey maybe we can move on from the dark ages.”

                                                                                          I don’t want to be accused of talking past the author. I think that the author would agree with an underlying point – that whichever language, IDE, framework you choose, you should choose with a good understanding of what your tool can do, and what the alternatives are.

                                                                                          *I mean, it might not do the same thing – you might want blazing speed or something else that C provides that Rust does not yet. So, yeah, choose your tools wisely.

                                                                                      1. 1

                                                                                        Be mindful of the context and likely reaction if you do this. I did something similar at a corporate job once and it didn’t go over quite like I’d hoped…

                                                                                        My coworker was a BSD guy who didn’t realize shutdown on Windows nags you after a small delay about uncooperative processes. I noticed after he’d left, so I hit cancel, dropped a text file in his Startup folder, shut down for real, and headed out the door.

                                                                                        The next morning my coworker came in and booted up. Notepad immediately popped up a message saying “You’ve been h4x0r3d: secure your box you fat bastard.” Not knowing Windows and probably distrusting it in the first place, he filed a security ticket. When I rolled in at 10am the sec team had unplugged his workstation and were in the process of carting it off to investigate. Oh the fun conversations we all had together that day.