Threads for andyc

  1. 10

    Oof, accessing out of bounds memory is pretty surprising to me for a dynamic language … But I guess it’s not surprising if your goal is to compile to fast native code (e.g. omit bounds checks).

    I don’t know that much about how Julia works, but I feel like once you go there, you need to have very high test coverage, and also run your tests in a mode that catches all bound errors at runtime. (they don’t have this?)

    Basically it’s negligent not to use ASAN/Valgrind with C/C++ these days. You can shake dozens or hundreds of bugs out of any real codebase that doesn’t use them, guaranteed.

    Similarly if people are just writing “fast” Julia code without good tests (which I’m not sure about but this article seems to imply), then I’d say that’s similarly negligent.


    I’ve also learned the hard way that composability and correctness are very difficult aspects of language design. There is an interesting tradeoff here between code reuse with multiple dispatch / implicit interfaces and correctness. I would say they are solving O(M x N) problems, but that is very difficult, similar to how the design of the C++ STL is very difficult and doesn’t compose in certain ways.

    1.  

      I haven’t tested it, but I also wondered “how can this be?” You can launch Julia as julia --check-bounds=yes which should override the @inbounds disabling of bounds checking.

      If that works, then the fact that the @inbounds bugs of the original article persisted for many years in spite of this “flip a switch and find them” option probably says the issue is more the “culture”. People often confuse culture and PLs, but it is true that as a consumer (who does not write all their own code) both matter.

      1.  

        Bounds checking is turned on by default when running package tests. The issue is that the bounds are not broken for regular arrays. If some tests were written with OffsetArrays then the errors would have been seen.

        There’s also the context that many of the affected packages were written before Julia supported arrays that are not indexed from 1 and were not updated (to be fair, not that many people use weirdly indexed arrays).
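
        A rough sketch of that failure mode, in Python rather than Julia. OffsetList is a made-up stand-in for an offset-indexed array, not the OffsetArrays API, and Python always bounds-checks, so the bug shows up as an IndexError instead of a silent out-of-bounds read:

```python
class OffsetList:
    """Hypothetical stand-in for an offset-indexed array: valid indices are
    first .. first + len - 1 instead of always starting at a fixed base."""

    def __init__(self, items, first=1):
        self._items = list(items)
        self.first = first

    def __len__(self):
        return len(self._items)

    def indices(self):
        # The container reports its own valid index range.
        return range(self.first, self.first + len(self._items))

    def __getitem__(self, i):
        if i not in self.indices():
            # Python always checks; with bounds checks compiled out, this
            # would be a silent out-of-bounds read instead of an error.
            raise IndexError(f"index {i} outside {self.indices()}")
        return self._items[i - self.first]


def total_assuming_one_based(xs):
    # Written when every array started at 1: hard-codes the range 1..len.
    return sum(xs[i] for i in range(1, len(xs) + 1))


def total_generic(xs):
    # Asks the container for its indices instead of assuming them.
    return sum(xs[i] for i in xs.indices())


regular = OffsetList([10, 20, 30], first=1)
shifted = OffsetList([10, 20, 30], first=-1)

print(total_assuming_one_based(regular))  # fine: indices are 1..3
print(total_generic(shifted))             # fine: indices are -1..1
try:
    total_assuming_one_based(shifted)     # index 2 is out of range here
except IndexError as err:
    print("caught:", err)
```

        A test suite that only ever builds the `regular` kind of input passes with bounds checking on, which is the situation described above.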

        1.  

          Yeah one thing I would add that’s not quite obvious is that you likely need “redundant” tests for both libraries and APPLICATIONS.

          This is because composing libraries is an application-specific concern, and can be done incorrectly. With Julia’s generality and dynamic nature, that concern is magnified.

          Again I’d make an analogy to C++ STL – you can test with one set of template instantiations, e.g. myfunction<int, float>. But that doesn’t mean myfunction<int, int> works at all! Let alone myfunction<TypeMyAppCreatedWhichLibraryAuthorsCannotKnowAbout, int>.

          In C++ it might fail to compile, which is good. But it also might fail at runtime. In a dynamic language you only have the option of failing at runtime.
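
          The dynamic-language version of that, as a small Python sketch. midpoint and Money are hypothetical names invented for the illustration; the point is only that the library's tests cover some type combinations and the application supplies another:

```python
from dataclasses import dataclass


def midpoint(a, b):
    # Implicit interface: needs `a + b` and division of the result by 2.
    return (a + b) / 2


@dataclass(frozen=True)
class Money:
    """Hypothetical application type the library author never saw."""
    cents: int

    def __add__(self, other):
        return Money(self.cents + other.cents)
    # No __truediv__: Money satisfies only half of the implicit interface.


# The combinations the library's own tests happen to cover.
print(midpoint(1, 3))      # 2.0
print(midpoint(1.5, 2.5))  # 2.0

# The combination only the application exercises.
try:
    midpoint(Money(100), Money(300))
except TypeError as err:
    print("fails only at runtime:", err)
```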


          I have a habit that I think is not super common – I write tests for the libraries I use, motivated by the particular use cases of my application. I also avoid package managers that pull in transitive dependencies (e.g. npm, Cargo, etc.)

          But yeah it sounds like there is a cultural change needed. I have some direct experience with an open source C maintainer rejecting ASAN changes… simply due to ignorance. It can be hard to change the “memes”.

          So to summarize I would say that in Julia it’s not enough for libraries to run tests with --check-bounds=yes – applications also need tests that run with it. And the tests should hit all the library code paths that the application hits.

          1.  

            Any Julia project (library or application) that has tests and runs them in the standard way will run them with bounds checking turned on.

            The issue was that Yuri was using a type with an indexing scheme the author hadn’t expected, and that scenario was not tested.

      1. 8

        It is pretty damning to the Go language that you can’t use any existing code. Just about every other language provides relatively straightforward (if not seamless) interop with any other C ABI. Only in Go have I heard such consistent and negative opinions on the FFI. Java and JNI are close, but it still seems better received and in that case at least there is a decent reason because it ruins your portability once you add native code.

        The fact that someone would recommend “reimplementing a large piece of C code in Go” instead of just binding to it is exposing a huge downside of the language.

        1. 4

          The fact that someone would recommend “reimplementing a large piece of C code in Go” instead of just binding to it is exposing a huge downside of the language.

          The main reason is so you “get” effortless portability as a result. I can only think of zig where you get some out-of-the-box portability without re-writing your C in zig (since it ships with libc for various platforms/archs and has all the toolchain nicely setup).

          1. 2

            i immediately thought of zig’s self contained compiler when i saw this post… and i recall things being posted to show how you can integrate zig cc in w/ go/cgo to have portable c compilers

            seems like it would be a good thing for these project maintainers to get on board with…

            1. 6

              I wrote a blog post where I cross-compiled with Zig the CGo SQLite library for all the major OSs without too much fuss.

              https://zig.news/kristoff/building-sqlite-with-cgo-for-every-os-4cic

          2. 3

            I can’t wait for Go to have an FFI some day!

            1. 1

              As mentioned above, I believe this to be simply untrue: Go has an FFI today and it’s called cgo. What is it about cgo that does not make it an FFI?

              1. 1

                cgo is basically a separate language. It is a separate implementation.

                1. 1

                  I can’t see how it’s a separate language. You embed a bit of C code in a special place within a Go file. The C code is compiled by a C compiler, the Go code by a compiler and cgo. And from the C and the Go code, cgo generates some interface code to make C names known to the Go compiler and some Go names known to the C compiler. How is cgo (which, to me, is a program) a separate language?

                  It is a separate implementation.

                  cgo is a separate implementation of what?

            2. 2

              Yes 100%, here is my lament from 4 years ago on that topic.

              https://news.ycombinator.com/item?id=16741043

              A big part of my pain, and the pain I’ve observed in 15 years of industry, is programming language silos. Too much time is spent on “How do I do X in language Y?” rather than just “How do I do X?”

              For example, people want a web socket server, or a syntax highlighting library, in pure Python, or Go, or JavaScript, etc. It’s repetitive and drastically increases the amount of code that has to be maintained, and reduces the overall quality of each solution (e.g. think e-mail parsers, video codecs, spam filters, information retrieval libraries, etc.).

              There’s this tendency of languages to want to be the be-all end-all, i.e. to pretend that they are at the center of the universe. Instead, they should focus on interoperating with other languages (as in the Unix philosophy).

              One reason I left Google over 6 years ago was the constant code churn without user visible progress. Somebody wrote a Google+ rant about how Python services should be rewritten in Go so that IDEs would work better. I posted something like <troll> … Meanwhile other companies are shipping features that users care about </troll>. Google+ itself is probably another example of that inward looking, out of touch view. (which was of course not universal at Google, but definitely there)

              This is one reason I’m working on https://www.oilshell.org – with a focus on INTEROPERABILITY and stable “narrow waists” (as discussed on the blog https://www.oilshell.org/blog/2022/02/diagrams.html )

              (copy of HN comment in response to almost the same observation!)

              I’m also excited about Zig for this reason. e.g. “maintain it with Zig” https://kristoff.it/blog/maintain-it-with-zig/

              1. 1

                On the other hand, oilshell is not(?) compatible with the piles of bash (and sh, and…) scripts out in the world, so folks have to rewrite it to be compatible with your shell. Is this not contradicting what you said earlier?

                1. 2

                  Hm honest question: Why do you think it’s not compatible?

                  It’s actually the opposite – it’s the ONLY alternative shell that’s compatible with POSIX sh and bash. It’s the most bash compatible shell by a mile.

                  Try running osh myscript.sh on your shell scripts and tell me what happens!


                  The slogan on the front page is supposed to emphasize that, but maybe it’s not crystal clear:

                  It’s our upgrade path from bash to a better language and runtime.

                  Also pretty high up on the FAQ is the statement:

                  http://www.oilshell.org/blog/2021/01/why-a-new-shell.html#introduction

                  OSH is a shell implementation that’s part of the Oil project. It’s compatible with both POSIX and bash. The goal is to run existing shell scripts. It’s done so since January 2018, and has matured in many regular releases since then.


                  Nonetheless I think it could be clearer, so I filed a bug to write a project tour and put it prominently on the home page:

                  https://github.com/oilshell/oil/issues/1127

                  It is disappointing to me that this hasn’t been communicated after so many years … I suspect that some people actually think the project is impossible. It’s both compatible AND it’s a new language.

                  I described in the latest blog post how that works:

                  http://www.oilshell.org/blog/2022/05/release-0.10.0.html#backward-compatibility-the-eternal-puzzle

                  Here are all your other alternative shell choices, NONE of which have the compatibility of OSH.

                  https://github.com/oilshell/oil/wiki/Alternative-Shells

                  (That is why the project is so large and long)

                  1. 1

                    Ah, I’m sorry. I skimmed the FAQ but missed that sentence. For some reason, the impression I got from your FAQ is that it’s basically yet another shell that doesn’t offer backwards compatibility. Obviously I was terribly wrong. I’m not sure how to suggest changes that may have prevented that (other than it’s completely my fault for misreading/skimming and getting the wrong impression.) So, sorry for the noise.

                    1.  

                      OK no worries … I think it actually did point out that this crucial fact about the project is somewhat buried. Not entirely buried but “somewhat”.

              2. 1

                Yes and no. I mean there is CGO, which you can use. While FFI is worse in Go, partly because of threading, especially on Linux, you’ll still find “pure” implementations of things that would usually use a C library, and sometimes they are faster because calling through the FFI might still be slow. Database interfaces are one such example, where people sometimes find the bottleneck to be the FFI.

                You also get certain benefits from not using C. I already mentioned the threading part which sometimes bites people in Go, but also you can be sure about memory safety, debugging will be easier, all the people using the project can contribute even when they are not fluent in C, etc.

                And if you still want/need to use C, there is CGO.

                There certainly have been cases in other languages where I wished a library wasn’t just a wrapper around C, be it Python, Java, Objective-C/Swift or in node.js-projects. Given circumstances they can be a source for headaches.

                1. 1

                  It is simply not true that “you can’t use any existing code” in Go. There’s cgo and it allows you to call into C code and provides a way for C code to call into Go code - that’s pretty much the definition of using existing code. I think a big reason people are complaining about JNI is the same as for people complaining about cgo: Because you are dealing with a garbage collected language, there are rules about what you can do with memory and pointers. The same applies to .NET as well.

                  The fact that someone would recommend “reimplementing a large piece of C code in Go” instead of just binding to it is exposing a huge downside of the language.

                  As the article points out, in the very first sentence, most people use mattn/go-sqlite3 which is in fact a wrapper around the canonical C SQLite implementation. A “decent reason” (your words) to not use that library is because “it ruins your portability” because “you add native code”. This reason is at play here.

                  This being said, the shown port to Go is a respectable effort. While being impressive, I’d probably use one of the bindings to the canonical C code if possible as it uses a highly tested implementation. If not possible the cznic provides an interesting alternative.

                1. 4

                  Does anyone know what part of the book is about language design? I feel I know enough about compiler construction, but I’d like to read more about language design. So far I feel language design is 100% art and 0% science, but I’d be delighted to discover that I’m wrong.

                  1. 3

                    I feel that there is a lot of math & science available to be applied to language design. I’ve been doing a deep dive into this. I don’t really see much, if any, of this math and science reflected in the textbook’s table of contents.

                    It looks to me that this book is stuck in an era where C and Pascal were at the leading edge of programming language design, and where syntax design is the main area where you apply math and science (formal methods) for language design. Eg, creating a BNF grammar for the syntax. I personally feel that syntax design is relatively shallow and trivial, and that the most important areas of language design, where math and formal methods should also be used, are elsewhere.

                    I don’t have a book or website that adequately summarizes this. It’s just that I’ve read hundreds of academic papers on programming language design, full of math, and have been working to pull out the parts that are relevant to my design goals. It’s hard to summarize.

                    1. 3

                      I skimmed over a few chapters (e.g. on semantic analysis) and I don’t think it’s about language design, at least not more than any other compiler book.

                      I think language design is mostly art but there are some things people can agree on (although they’ve probably not been written down):

                      • syntax and semantics should correspond, e.g. similar things should look similar; different things should look different
                      • familiar syntax does matter, because applications these days are composed of many languages. e.g. you can be a purist and say = means equality, but if == and === mean equality in the 10 other languages I program in, I will be annoyed with your language
                      • There’s no substitute for writing tons of programs. Or conversely a language designed by someone who has only written a few programs will almost certainly be bad / limited

                      A pretty good paper:

                      Seven Deadly Sins of Introductory Language Design https://users.monash.edu/~damian/papers/PDF/SevenDeadlySins.pdf (although I would say there is a bias toward languages for undergrads in academia, for obvious reasons)


                      Programming Language Explorations is a good book that may spark thoughts on language design. Chapter 14 has an explicit list of language design questions

                      https://www.amazon.com/Programming-Language-Explorations-Ray-Toal/dp/149873846X

                      I recommended it 3 years ago: https://old.reddit.com/r/ProgrammingLanguages/comments/d0o85t/nice_book_for_language_designers_programming/

                    1. 4

                      I think the justification is thoroughly out of date:

                      http://doc.cat-v.org/bell_labs/pikestyle

                      There’s a little dance involving #ifdef’s that can prevent a file being read twice, but it’s usually done wrong in practice - the #ifdef’s are in the file itself, not the file that includes it. The result is often thousands of needless lines of code passing through the lexical analyzer, which is (in good compilers) the most expensive phase.

                      Modern preprocessors are extremely fast and optimized (e.g. with respect to string allocation)

                      And even if they weren’t, neither pre-processing nor lexing would dominate even a debug build (i.e. where the code gen phase is as fast as possible).

                      Again I agree with another commenter that says this post lacks numbers.


                      I just did a little test of #include <vector> duplicated in multiple headers. Then I use cc -E .. | wc -l to count lines.

                      It doesn’t get duplicated, almost certainly because #pragma once works.

                      And also I did a test of the traditional include guards in the header, and it works fine. The lines do NOT get duplicated and do NOT get passed to the lexer. The file gets opened and preprocessed, so you can technically save that. But again I’d like to see numbers.

                      It’s weird to call it “a little dance involving #ifdefs” when I haven’t seen a codebase in 20 years that doesn’t do that, or something more modern. A lot has changed since this article was written.

                      1. 3

                        You then #include "foo2.h" in foo.c and bang! You just included and parsed bar.h twice.

                        This is what #pragma once is supposed to prevent.

                        Old style header guards were a problem since the compiler still needed to read the entirety to look for the tail #endif. It had been such a problem that Lakos in Large Scale C++ Software Design recommended and showed how using redundant header guards improved compilation speed by not opening the file:

                        // foo.h
                        #ifndef FOO_INCLUDED
                        #define FOO_INCLUDED
                        //.... contents ...
                        #endif // FOO_INCLUDED
                        
                        // usage of foo.h
                        #ifndef FOO_INCLUDED
                        #include "foo.h"
                        #endif
                        

                        The more modern solutions seem roughly in order to be:

                        1. Use #pragma once
                        2. Use forward declarations
                        3. Use precompiled headers
                        4. Use Include what you use to track and minimize includes
                        5. PIMPL pattern
                        6. (Only last since requires C++20) Modules

                        1. 1

                          Yeah I would also say the statement is slightly inaccurate. You included it twice, which means you could have pre-processed it twice, but the compiler didn’t PARSE it twice, as long as you have the traditional include guards.

                          I agree with another comment in that I’d like to see some data.

                          I’m working on optimizing the build of oil-native now, moving to Ninja … so if anyone has tips let me know.

                          I think I may count the lines of every pre-processor-expanded translation unit, which is something obvious I’ve never done …

                        1. 2

                          Feels somewhat similar to a test harness I wrote for testing Oil, e.g.

                          https://github.com/oilshell/oil/blob/master/spec/xtrace.test.sh

                          You write some shell, then put STDOUT blocks and STDERR blocks. (I used a little recursive descent parser where the tokens are lines!)

                          After I wrote it I found that other shells have more or less the same thing. I think there was one for the OpenBSD shell in Perl etc.

                          1. 9

                            With Egg expressions, you must quote characters that aren’t alphanumeric, so the difference between literals and operators is clear.

                            oil$ const pat = / [',' - '.'] /
                            oil$ = pat
                            (Regex)   [,-.]
                            

                            (The = operator shows the value of an expression, which in the regex case is the ERE it compiles to.)

                            oil$ const pat = / [ - '.'] /
                              const pat = / [ - '.'] /
                                              ^
                            [ interactive ]:8: Syntax error in expression (near Id.Arith_Minus)
                            

                            So if you quoted that, then it would work. The same applies outside character classes – CODE and DATA are distinct! This is a big problem with both shell and regex syntax.

                            Egg expressions are also statically parsed like code, not dynamically parsed like data.

                            1. 3

                              Recursive descent is a great technique to learn, but this example recognizes a regular language, which is non-recursive (e.g. $3 or £42).

                              I tend to handle all the non-recursive structure with regular languages, which is much simpler and faster [1]. As a regex it’s basically:

                              [$£€][0-9]+
                              

                              I consider this lexing, which I don’t think is very controversial. And then the technique for parsing recursive structure is complementary to lexing, and depends on the language you’re recognizing. You can hand-write it with recursive descent, use a yacc-like LALR(1) parser generator, PEG, etc.

                              More on this argument: Why Lexing and Parsing Should Be Separate

                              [1] https://www.oilshell.org/blog/2020/07/eggex-theory.html – some more arguments here but this post is a bit dense, and not sure most people got it
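
                              For concreteness, a minimal Python sketch of that lexing step using the standard re module. lex_amounts is just an illustrative name, and this is not the code from the article under discussion:

```python
import re

# One token kind: a currency symbol followed by one or more digits.
AMOUNT = re.compile(r"[$£€][0-9]+")


def lex_amounts(text):
    """Return (symbol, value) pairs for every amount found in text."""
    return [(m.group()[0], int(m.group()[1:])) for m in AMOUNT.finditer(text)]


print(lex_amounts("costs $3 or £42, not €x"))  # [('$', 3), ('£', 42)]
```

                              Anything recursive (nesting, precedence) would then be parsed on top of these tokens by whatever technique suits the language.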

                              1. 4

                                Today I learned about the “if myfunc” pitfall. I’m pretty fond of shell programming in general, in a punk/lomography kind of way, but that one scares me.

                                1. 2

                                  Yeah it comes up on the bash mailing list with some frequency. I think it’s been true for 30 years or more, and every so often there is a big thread about it. Hence “groundhog day”.

                                  Here’s another one (where I participate but so do many other people):

                                  https://news.ycombinator.com/item?id=27163494

                                  It’s weird shell folk knowledge.

                                  I add this to my scripts to flag it when run under OSH:

                                  shopt -s strict:all 2>/dev/null || true 
                                  

                                  https://github.com/oilshell/oil/blob/master/test/spec.sh#L9

                                  (Right now Oil’s own scripts are run under bash as well.)

                                  1. 1

                                    +1 for

                                    punk/lomography

                                    I’ll henceforth use that to explain the essence of shell programming.

                                  1. 3

                                    Kind of related, I’ve been meaning to do some basic experiments/benchmarks to understand when handwritten lexing is faster than regexps (for example, comparing some typical [not necessarily complete] email regexp matcher with a handwritten matcher that matches the same strings).

                                    At the highest level I assume that kind of quickly a lexer will be faster than most regexps since my lexer would be a precise function and regexp is a general purpose engine.

                                    But I’m sure there will be surprises.

                                    1. 2

                                      Yes, that’s almost certainly the case, especially given that Go’s regexp package is quite slow (guaranteed linear-time, but slow): https://github.com/golang/go/issues/26623

                                      1. 2

                                        FWIW you can also generate code from regexes – i.e. it can be a regex compiler rather than an interpreter.

                                        For example Oil’s lexer uses re2c to generate C code.

                                        intermediate source: https://www.oilshell.org/release/0.10.0/source-code.wwz/_devbuild/tmp/osh-lex.re2c.h

                                        generated code: https://www.oilshell.org/release/0.10.0/source-code.wwz/_devbuild/gen/osh-lex.h

                                        The generated code is a big state machine with switch and goto, and it doesn’t allocate memory or anything like that. It’s quite fast in my experience.
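
                                        To give a feel for the shape of such a state machine, here is a hand-written Python sketch for one tiny pattern, a currency symbol followed by digits ([$£€][0-9]+). It is not re2c output (that is C with switch and goto), and the state names and match_amount are made up for the illustration:

```python
START, NEED_DIGIT, IN_DIGITS = range(3)


def match_amount(s, pos=0):
    """Return the end index of the longest match of [$£€][0-9]+ starting
    at pos, or -1 if there is no match."""
    state = START
    i = pos
    while i < len(s):
        ch = s[i]
        if state == START:
            if ch in "$£€":
                state = NEED_DIGIT
            else:
                return -1
        elif "0" <= ch <= "9":   # reached from NEED_DIGIT or IN_DIGITS
            state = IN_DIGITS
        else:
            break                # first character that cannot extend the match
        i += 1
    # Only IN_DIGITS is an accepting state.
    return i if state == IN_DIGITS else -1


print(match_amount("$3 or £42"))         # 2
print(match_amount("$3 or £42", pos=6))  # 9
print(match_amount("$x"))                # -1
```

                                        It is a single pass over the input with one integer of state and no backtracking, which is the property that makes the generated version fast.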


                                        I meant to write a blog post on re2c, but haven’t gotten around to it. Some pictures here:

                                        https://github.com/oilshell/blog-code/tree/master/fgrep-problem-benchmarks

                                        Another benefit of regular languages is that you get (simulated) nondeterminism FOR FREE. It’s still a linear pass, but it can be in multiple states at once, and then make the decision much later.

                                        So it can be easier to write certain kinds of lexers, and even move work from the parser to the lexer. I wrote about related topics here but didn’t quite emphasize the nondeterminism:

                                        https://www.oilshell.org/blog/2020/07/eggex-theory.html

                                      1. 3

                                        # try sets _status, not $?

                                        why is that?

                                        1. 8

                                          Good question, I will put this in the FAQ !

                                          Basically the idea is that idiomatic Oil scripts never look at $? – it’s only used to trigger the errexit rule.

                                          Instead they invoke try and inspect _status when they want to handle errors


                                          Some detail: try is a builtin that takes a block, and every builtin has an exit code.

                                          The exit status of try is always zero, because if it returned a non-zero status for $?, then the errexit rule would trigger, and you wouldn’t be able to handle the error!

                                          So there are no special cases in the interpreter for try.

                                          For context, other builtins that take a Ruby-like block:

                                          • cd – save and restore current dir
                                          • shopt – save and restore option state
                                          • forkwait – “replacement” for subshell ( cd / ; echo $PWD ) which is rarely used in Oil
                                          • fork – replacement for &

                                          Generally the errors occur INSIDE the block, not outside, like:

                                          cd /tmp {
                                             cp zzzz /   # error happens here
                                             echo done 
                                          }  # not here
                                          

                                          Of course you can nest cd and try if you want.

                                          And the _ convention for _status is for globals / “registers” that the shell itself mutates:

                                          • _status, and the rarely used _pipeline_status, _process_sub_status
                                          • _match(1) for the match of a regular expression
                                          • _this_dir for imports relative to the current directory

                                          Related questions: https://www.oilshell.org/release/0.10.0/doc/error-handling.html#faq-on-language-design (why is there try but no catch)
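
                                          If it helps, here is a toy model of that rule in Python. It is only a sketch of the idea described above, not how Oil is implemented: ToyShell, ErrExit, run and try_ are all hypothetical names, and a "command" is reduced to the exit status it produces.

```python
class ErrExit(Exception):
    """Raised by the toy interpreter when a command fails under errexit."""

    def __init__(self, status):
        super().__init__(f"command failed with status {status}")
        self.status = status


class ToyShell:
    def __init__(self):
        self.last_status = 0   # plays the role of $?
        self._status = 0       # plays the role of _status

    def run(self, status):
        """Pretend to run a command that exits with `status`."""
        self.last_status = status
        if status != 0:
            raise ErrExit(status)   # the errexit rule

    def try_(self, block):
        """Run block, record its status in _status, then exit 0 itself."""
        try:
            block(self)
            self._status = 0
        except ErrExit as err:
            self._status = err.status
        self.run(0)   # try behaves like any other builtin, with status 0


def failing_block(sh):
    sh.run(1)   # like `cp zzzz /` failing inside the block
    sh.run(0)   # like `echo done`; never reached


sh = ToyShell()
sh.try_(failing_block)
print(sh._status, sh.last_status)  # 1 0
```

                                          Because try_ finishes with status 0, the errexit rule never fires for it, and the caller is free to inspect _status afterwards.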

                                          1. 2

                                            Good answer :)

                                            That’s pretty neat, being able to implement this without special cases.

                                            1. 1

                                              Glad it made sense! A related thing that I think will prove useful is that you don’t have to pass a block literal to try. Instead you can pass a variable like

                                              const myblock = &(cd /tmp; echo $PWD)
                                              try (myblock)   # typed argument to try
                                              

                                              I mentioned this in a thread a few months ago:

                                              https://lobste.rs/s/irvnko/recent_progress_on_oil_shell

                                              I don’t expect this to be common in app code, but I can see it being useful in test frameworks and the like.

                                        1. 2

                                          It seems like more technologies which plug the dynamicity gap on the server-side — languages like PHP — would be valuable. The CGI/PHP deployment and execution model clearly excites people as demonstrated by the popularity of AWS Lambda, yet we still see relatively little new development in this space; the ability to dip in and out of code in web pages to offer mildly dynamic functionality on the server side without frameworks is empowering yet is largely offered by PHP alone

                                          I kind of want Oil to go in this direction, and it is natural IMO because most of https://www.oilshell.org is static content generated by shell invoking tools in Python. It’s written basically like PHP in an “inline” style, although I also like using shell functions.

                                          And the Oil language is roughly a hybrid between shell and Python, so it’s not too hard to imagine it being used dynamically.

                                          CGI and FastCGI are both appropriate. Shell is the language of process-based concurrency!

                                          Oil also fixes all the reasons you wouldn’t want to do this, like the Shellshock-like “hidden evals” that bash still has. (In OSH and Oil shopt --set unsafe_arith_eval is off by default.)

                                          PHP itself started as a sort of shell. It has echo and here docs and $var, etc. It was a bunch of CGI scripts glued together.


                                          The thing shell and the web have in common is that they both embrace heterogeneity, and they’re both languages that evolve, e.g.

                                          https://www.oilshell.org/blog/2020/02/good-parts-sketch.html#web-sites-are-naturally-made-with-shell-scripts

                                          http://www.oilshell.org/blog/2021/01/philosophy-design.html#a-slogan-an-evolving-set-of-tools

                                          I think of both HTML and shell as sort of “skeleton languages”

                                          (FWIW I also wrote PHP for the first time last year and definitely appreciate several aspects of it, despite all the flaws.)

                                          1. 3

                                            Question: is becoming a niche programmer possible for someone with little to no experience? Is it a good nontraditional route to enter the industry?

                                            1. 5

                                              I think it depends on the niche? For Clojure, my sense is that you can pick it up with little or no experience, but you have to be intelligent in a particular way

                                              e.g. In my experience probably some smart music majors can pick it up, but others might have problems. That is, people “just looking for jobs” will likely have issues with Clojure. There is a tendency to do more from first principles and not follow canned patterns. Also smaller number of StackOverflow answers

                                              1. 5

                                                Absolutely, my previous job hired lots of interns to work with Clojure who had very little programming experience. The interesting part we found was that people without experience often have an easier time learning Clojure because they don’t have any preconceptions about how code should be written. A few of the students my team hired ended up specializing in Clojure after and haven’t had to work with anything else since.

                                                Since Clojure is niche, companies end up having to train most of their devs, so having familiarity with the language is seen as a big plus. Having some small project on GitHub that you can link to in your resume would go a long way here.

                                                1. 2

                                                  I think all this post shows is that it’s possible to do much better than the mainstream in a niche, but it’s also possible to do a lot worse.

                                                  The thing about a niche is it’s unique, so how can you generalize about it? It completely depends on what niche it is.

                                                1. 17

                                                  On the one hand, I totally get the value of a lack of a build step. Build steps are annoying. On the other hand, authoring directly in HTML is something I am perfectly happy to do as little of as possible. It’s just not a pleasant language to write in for any extended amount of time!

                                                  1. 20

                                                    I’m pretty convinced that Markdown is the local maximum for the “low effort, nice looking content” market.

                                                    1. 10

                                                      Agreed. ASCIIDoc, reStructuredText, LaTeX, and other more-robust-than-Markdown syntaxes all have significantly more power but also require a great deal more from you as a result. For just putting words out, Markdown is impressively “good enough”.

                                                      1. 4

                                                        I can never remember Markdown syntax (or any other wiki syntax for that matter), while I’m fairly fluent in HTML, and I’m not even a frontend dev. HTML also has the advantage that if some sort of exotic markup is necessary, you know it’s expressible, given time and effort.

                                                        1. 7

                                                          That’s fine, because Markdown allows embedded HTML [1]

                                                          About the only thing that’s a bit obtuse is the link syntax, and I’ve gladly learned that to not have to manually enclose every damn list with or tags.

                                                          [1] at least Gruber’s OG Markdown allowed it by default, and I recently learned CommonMark has an “unsafe” mode to allow it too.

                                                          1. 11

                                                            The trick to remember how to do links in Markdown is to remember that there are brackets and parentheses involved, then think what syntax would make sense, then do the opposite.

                                                            1. 4

                                                              For reference: a Markdown [link](https://example.com)

                                                              Elaboration on the mnemonic you describe

                                                              I thought like you when I first started learning Markdown:

                                                              • Parentheses () are normal English punctuation, so you would intuitively expect them to surround the text, but they don’t.
                                                              • Square brackets [] are technical symbols, so you would intuitively expect them to surround the URL, but they don’t.

                                                              However, I find “don’t do this” mnemonics easy to accidentally negate, so I don’t recommend trying to remember the order that way.

                                                              Another mnemonic

                                                              I think Markdown’s order of brackets and parentheses is easier to remember once one recognizes the following benefit:

                                                              When you read the first character in […](…), it’s clear that you’re reading a link. ‘[’ is a technical symbol, so you know you’re not reading a parenthetical, which would start with ‘(’. Demonstration:

                                                              In this Markdown, parentheticals (which are everywhere) and
                                                              [links like these](https://example.com) can quickly be told
                                                              apart when reading from left to right.
                                                              
                                                              Why not URL first?

                                                              Since you wrote that Markdown does “the opposite”, I wonder if you also intuitively expect the syntax to put the URL before the text, like in [https://www.mediawiki.org/wiki/Help:Links MediaWiki’s syntax] (actual link: MediaWiki’s syntax). I never found that order intuitive, but I can explain why I prefer text first:

                                                              When trying to read only the text and skip over the URLs, it’s easier to skip URLs if they come between grammatical phrases of the text (here), rather than interrupting a (here) phrase. And links are usually written at the end of phrases, rather than at the beginning.

                                                              1. 2

                                                                Well I’ll be damned. That completely makes sense.

                                                                I do, however, wonder whether this is a post-hoc rationalization and the real reason for the syntax is much dumber.

                                                              2. 3

                                                                Hah. The mnemonic I use is that everyone gets the ) on the end of their wiki URLs fucked up by Markdown… because the () goes around the URL. Therefore it is []().
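To make the trailing-parenthesis problem concrete, here is a minimal sketch in Python. The regex is a deliberately naive stand-in for a simplistic renderer, not how any particular Markdown implementation works, and `naive_links` and `escape_parens` are hypothetical helper names. CommonMark-style parsers accept balanced parentheses in a link destination, so this mostly bites older or simpler renderers.

```python
import re

# Deliberately naive: link text in [], destination in (), stopping at the first ")".
NAIVE_LINK = re.compile(r"\[([^\]]*)\]\(([^)]*)\)")


def naive_links(markdown: str) -> list[tuple[str, str]]:
    """Return (text, url) pairs the way a simplistic renderer would see them."""
    return NAIVE_LINK.findall(markdown)


def escape_parens(url: str) -> str:
    """Percent-encode parentheses so the URL survives inside (...)."""
    return url.replace("(", "%28").replace(")", "%29")


url = "https://en.wikipedia.org/wiki/Lua_(programming_language)"

broken = naive_links(f"[Lua]({url})")
fixed = naive_links(f"[Lua]({escape_parens(url)})")

print(broken)  # the URL has lost its closing parenthesis
print(fixed)   # the percent-encoded URL comes through whole
```

Percent-encoding is the workaround that does not depend on which renderer ends up processing the text.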

                                                                1. 2

                                                                  This is exactly what I do. Parens are for humans, square brackets are for computers, so obviously it’s the other way around in markdown.

                                                                2. 3

                                                                  A wiki also implies a social contract about editability. If my fellow editors have expressed that they’re uncomfortable with HTML, it’s not very polite of me to use it whenever I find Markdown inconvenient.

                                                                  1. 1

                                                                    Of course. I was replying in context of someone writing for themselves.

                                                                3. 3

                                                                  This is interesting: I’ve heard that same experience report from a number of people over the years so I believe it’s a real phenomenon (the sibling comment about links especially being the most common) but Markdown clicked instantly for me so I always find it a little surprising!

                                                                  I have hypothesized that it’s a case of (a) not doing it in a sustained way, which of course is the baseline, and (b) something like syntactical cross-talk from having multiple markup languages floating around; I took longer to learn Confluence’s wiki markup both because it’s worse than Markdown but also because I already had Markdown, rST, and Textile floating around in my head.

                                                                  I’m curious if either or both of those ring true, or if you think there are other reasons those kinds of markup languages don’t stick for you while HTML has?

                                                                  1. 2

                                                                    I’m not Michiel, but for me, it’s because HTML is consistent (even if it’s tedious). In my opinion, Gruber developed Markdown to make it easier for him to write HTML, and to use conventions that made sense to him for some shortcuts (the fact that you could include HTML in his Markdown says to me that he wasn’t looking to replace HTML). Markdown was to avoid having to type common tags like <P> or <EM>.

                                                                    For years I hand wrote the HTML for my blog (and for the record, I still have to click the “Markdown formatting available” link to see how to make links here). A few years ago I implemented my own markup language [1] that suits me. [2] My entries are still stored as HTML. That is a deliberate decision so I don’t get stuck with a subpar markup syntax I later come to hate. I can change the markup language (I’ve done it a few times already) and if I need to edit past entries, I can deal with the HTML.

                                                                    [1] Sample input file

                                                                    [2] For instance, a section for quoting email, which I do quite often. Or to include pictures in my own particular way. Or tabular data with a very light syntax and some smarts to generate the right class on <TD> elements consisting of numeric data (so they’re flush right). Stuff like that.

                                                                    1. 2

                                                                      Yeah, with markdown, I often accidentally trigger some of its weird syntax. It needs a bunch of arbitrary escapes, whereas HTML you can get away with just using &lt;. Otherwise, it is primarily just those <p> tags that get you; the rest are simple or infrequent enough to not worry about.

                                                                      whereas again, with the markdown, it is too easy to accidentally write something it thinks is syntax and break your whole thing.

                                                                      1. 1

                                                                        Yes, I’ve found that with mine as well.

                                                                      2. 1

                                                                        I don’t mean this as an offense, but I did a quick look at your custom markup sample and I hated pretty much everything about it.

                                                                        Since we’re all commenting under a post from someone that is handwriting HTML, I think it goes without saying that personal preferences can vary enormously.

                                                                        Updated: I don’t hate the tables syntax, and, although I don’t particularly like the quote syntax, having a specific syntax for it is cool and a good idea.

                                                                        1. 1

                                                                          Don’t worry about hating it—even I hate parts of it. It started out as a mash up of Markdown and Org mode. The current version I’m using replaces the #+BEGIN_blah #+END_blah with #+blah #-blah. I’m still working on the quote syntax. But that’s the thing—I can change the syntax of the markup, because I don’t store the posts in said markup format.

                                                                      3. 2

                                                                        You’re absolutely right, and so is spc476; HTML has a regular syntax. Even if I’ve never seen the <aside> tag, I can reason about what it does. Escaping rules are known and well-defined. If you want to read the text, you know you can just ignore anything inside the angular brackets.

                                                                        Quick: in Markdown, if I want to use a backtick in a fixed-width span, do I have to escape it? How about an italic block?

                                                                        This would all be excusable if Markdown was a WYSIWYG plain-text format (as per Gruber’s later rationalisation in the CommonMark debate). Then I could mix Markdown, Mediawiki, rST and email syntax freely, because it’s intended for humans to read, and humans tend to be very flexible.

                                                                        But people do expect to render it to HTML, and then the ambiguities and flexibility become weaknesses, rather than strengths.
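For the backtick question above, the CommonMark answer (as far as I understand the spec) is that backslash escapes do not work inside a code span. Instead you pick a delimiter made of more backticks than any run inside the content, and pad with a space when the content itself starts or ends with a backtick. A minimal sketch in Python, where `code_span` is a hypothetical helper name and content that begins and ends with spaces is not handled:

```python
import re


def code_span(text: str) -> str:
    """Wrap text in a CommonMark-style code span with a delimiter that cannot clash."""
    # Longest run of consecutive backticks inside the text (0 if there are none).
    longest = max((len(run) for run in re.findall(r"`+", text)), default=0)
    fence = "`" * (longest + 1)
    # If the content touches the delimiter with a backtick, pad with one space;
    # CommonMark strips one leading and one trailing space from span content.
    if text.startswith("`") or text.endswith("`"):
        text = f" {text} "
    return f"{fence}{text}{fence}"


print(code_span("ls -l"))
print(code_span("a ` b"))
print(code_span("`"))
```

That the rule needs a helper function at all rather supports the point about regularity.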

                                                                    2. 2

                                                                      ASCIIDoc

                                                                      While I agree about the others, I fairly strongly disagree about AsciiDoc (in asciidoctor dialect). When I converted my blog from md to adoc, the only frequent change was the syntax of links (in adoc, URL goes first). Otherwise, markdown is pretty much valid asciidoc.

                                                                      Going in the opposite direction would be hard though — adoc has a bunch of stuff inexpressible in markdown.

                                                                      I am fairly certain in my opinion that, purely as a language, adoc is far superior for authoring html-shaped documents. But it does have some quality of implementation issues. I am hopeful that, after it gets a standard, things in that front would improve.

                                                                      1. 1

                                                                        That’s helpful feedback! It’s lumped with the others in my head because I had such an unhappy time trying to use it when working with a publisher[1] a few years back; it’s possible the problem was the clumsiness of the tools more than the syntax. I’ll have to give it another look at some point!

                                                                        [1] on a contract they ultimately dropped after an editor change, alas

                                                                    3. 4

                                                                      Agree, I’ve been using it a ton since 2016 and it has served me well. I think it’s very “Huffman coded” by people who have written a lot. In other words, the common constructs are short, and the rare constructs are possible with embedded HTML.


                                                                      However I have to add that I started with the original markdown.pl (written ~2004) and it had some serious bugs.

                                                                      Now I’m using the CommonMark reference implementation and it is a lot better.

                                                                      CommonMark is a Useful, High-Quality Project (2018)

                                                                      It has additionally standardized markdown with HTML within markdown, which is useful, e.g.

                                                                      <div class="">
                                                                      
                                                                      this is *markdown*
                                                                      
                                                                      </div>
                                                                      

                                                                      I’ve used both ASCIIDoc and reStructuredText and prefer markdown + embedded HTML.

                                                                      1. 3

                                                                        I tend to agree, but there’s a very sharp usability cliff in Markdown if you go beyond the core syntax. With GitHub-flavoured Markdown, I can specify the language for a code block, but if I write virtual then there’s no consistent syntax to specify that it’s a C++ code snippet and not something else where the word ‘virtual’ is an identifier and not a keyword. I end up falling back to things like liquid or plain HTML. In contrast, in LaTeX I’d write \cxx{virtual} and define a macro elsewhere.

                                                                        I wish Markdown had some kind of generic macro definition syntax like this, which I could use to provide inline domain-specific semantic markup that was easier to type (and use) than <cxx>virtual</cxx> and an XSLT to convert it into <code style="cxx">virtual</code> or whatever.
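One way to approximate that kind of macro today is a small preprocessing pass that runs before the Markdown renderer. The sketch below is only an illustration in Python: the `MACROS` table, the `expand_macros` name, and the choice to leave unknown tags untouched are all assumptions, and nested macros are not handled.

```python
import re
from html import escape

# Hypothetical macro table: tag name -> HTML template. Neither the tag names
# nor the output markup come from any Markdown standard.
MACROS = {
    "cxx": '<code style="cxx">{body}</code>',
}

# Matches <name>body</name> with the same name on both sides (no nesting).
MACRO_RE = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def expand_macros(source: str) -> str:
    """Rewrite known macro tags into HTML; leave every other tag alone."""

    def replace(match: re.Match) -> str:
        name, body = match.group(1), match.group(2)
        template = MACROS.get(name)
        if template is None:
            return match.group(0)  # not a macro, e.g. ordinary <em>...</em>
        return template.format(body=escape(body))

    return MACRO_RE.sub(replace, source)


print(expand_macros("Mark the method <cxx>virtual</cxx> and keep <em>this</em>."))
```

It is still more typing than \cxx{virtual}, but the definition lives in one place, which is most of what the LaTeX macro buys you.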

                                                                        1. 3

                                                                          I agree. What sometimes makes me a bit sad is that Markdown had one feature the others lacked: you can write it so that the source itself is a nice-looking text document, one you might just output on the terminal, for example.

                                                                          It kind of has that nicely formatted plain text email style. Also with the alternative syntax for headings.

                                                                          Yet when looking at READMEs in many projects it is really ugly and hard to read for various reasons.

                                                                          1. 4

                                                                            The biggest contributor there in my experience (and I’m certainly “guilty” here!) is unwrapped lines. That has other upsides in that editing it doesn’t produce horrible diffs when rewrapping, but that in turn highlights how poor most of our code-oriented tools are at working with text. Some people work around the poor diff experience by doing a hard break after every sentence so that diffs are constrained and that makes reading as plain text even worse.

                                                                            A place I do wrap carefully while using Markdown is git commit messages, which are basically a perfect use case for the plain text syntax of Markdown.

                                                                            1. 1

                                                                              I honestly don’t care that much about the diffs? I always wrap at around 88/90 (Python’s black’s default max line length), and diffs be damned.

                                                                              I also pretty much NEVER have auto wrap enabled, especially for code. I’d rather suffer the horizontal scroll than have the editor lie about where the new lines are.

                                                                        2. 4

                                                                          It’s not just that they’re annoying, computing has largely been about coping with annoyances ever since the Amiga became a vintage computer :-). But in the context of maintaining a support site, which is what the article is about, you also have to deal with keeping up with whatever’s building the static websites, the kind of website that easily sits around for like 10-15 years. The technology that powers many popular static site generators today is… remarkably fluid. Unless you want to write your own static site generator using tools you trust to stay sane, there’s a good chance that you’re signing up for a bunch of tech churn that you really don’t want to deal with for a support site.

                                                                          Support sites tend to be built by migrating a bunch of old pages in the first two weeks, writing a bunch of new ones for the first two months, and then infrequently editing existing pages and writing maybe two new pages each year for another fifteen years. With most tools today, after those first two or three honeymoon years, you end up spending more time just keeping the stupid thing buildable than actually writing the support pages.

                                                                          Not that writing HTML is fun, mind you :(.

                                                                          (Please don’t take this as a “back in my day” lament. A static site generator that lasts 10 years is doable today and really not bad at all – how many tools written in 1992 could you still use in 2002, with good results, not as an exercise in retrocomputing? It’s not really a case of “kids these days ruined it” – it’s just time scales are like that ¯\(ツ)/¯ )

                                                                          1. 1

                                                                            Heh. I was using an editor written in 1981 in 2002! [1] But more seriously, I wrote a static site generator in 2002 that I’m still using (I had to update it once in 2009 due to a language change). On the down side, the 22 year old codebase requires the site to be stored in XML, and uses XSLT (via xsltproc) to convert it to HTML. On the plus side, it generates all the cross-site links automatically.

                                                                            [1] Okay, it was to edit text files on MS-DOS/Windows.

                                                                          2. 2

                                                                            I find that writing and editing XML or HTML isn’t so much of a pain if you use some kind of structural editor. I use tagedit in Emacs along with a few snippets / templates and imo it’s pretty nice once you get used to it.

                                                                          1. 8

                                                                            Nice article!

                                                                            There are now tens of thousands of researchers (at a low-end estimate) taking the basic idea of machine learning and either applying small tweaks to it, or finding new problem domains to apply it to. It can seem like there’s an almost infinite set of small problems to work on in machine learning, which means that no-one needs to do too much thinking before choosing one and charging forwards. It’s easy to dismiss the individual value of most of this work as very low — yet, collectively, it’s clearly pushing this field forwards.

                                                                            I definitely feel this … I happened to work on a machine learning paper in 2016, and recently discovered that it has thousands of citations. I’m not sure it’s particularly valuable (although I can’t say it’s not either!).

                                                                            I was recruited to work on it because of my skills in Python, R, and using big clusters, not because I’m interested in ML.

                                                                            I might be slightly less charitable about it than you … I would say there are literally thousands of “shitty machine learning intern projects” wasting compute power that have been going on for the last ~6 years.

                                                                            I know this because I worked on a shitty machine learning research project as an undergrad over 20 years ago, at Xerox PARC! This cured me of my interest in ML.

                                                                            And I worked adjacent to many such projects in 2016.

                                                                            (Funny thing is that I heard Richard Hipp of sqlite said he worked on NLP as a grad student in the 90’s. He basically thinks it’s BS now, and said as much on a couple occasions.)

                                                                            Here’s another example I sent my mom:

                                                                            https://healthcare-in-europe.com/en/news/machine-learning-for-covid-19-diagnosis-promising-but-still-too-flawed.html

                                                                            This paper evaluated 415 different machine learning models for detecting COVID from images, and none of them really worked. That’s a crazy amount of resources that amounts to very little IMO.


                                                                            I also agree with this section:

                                                                            For example, an astonishing number of people I’ve come across tout WebAssembly as a solution to their problems (and a wide variety of problems at that!). I see it as a neat solution to a niche problem (running C/C++-ish code in a browser) but I am unable to see it as a general solution to other problems. I hope, however, that other people are right and I’m wrong!

                                                                            And maybe I’ll be less charitable than you again. What I’m seeing with WebAssembly is similar to what people said about the JVM.

                                                                            There is the famous quote that Java would turn Windows into a collection of poorly debugged device drivers or something. Java wanted to be an OS.

                                                                            Likewise I see this same “inner platform effect” tendency with WebAssembly enthusiasts – they want it to be an OS.

                                                                            But the JVM ended up as “just another Unix process” (largely), and I think WebAssembly will have a similar role. It’s important and useful, but it’s not the foundation of all computing.


                                                                            Likewise I think the current branch of ML is a dead end, with too many researchers gathering around it. But of course we will see many other improvements / paradigms in ML in the future, and they will be important.

                                                                            I think it’s just a manifestation of the fact most research amounts to very little, and most impact is achieved by the rarest kind of work.

                                                                            Another clear example from COVID was that Katalin Kariko was essentially pushed down and out of U Penn in the 90’s and early 2000’s. She was one of the main inventors of the mRNA vaccines, and probably deserves a Nobel Prize!

                                                                            https://billypenn.com/2020/12/29/university-pennsylvania-covid-vaccine-mrna-kariko-demoted-biontech-pfizer/

                                                                            I guess I wrote this to remind myself and others: following trends can be a waste of time, and ultimately not very impactful! But trends usually have a core kernel of promise, so distinguishing what’s worthwhile and what’s not is hard.

                                                                            1. 4

                                                                              This is the first time I see an attempt to do this. A robust language with all the constructs and unambiguous syntax of general-purpose programming languages, but attempting a direct syntax for command invocation.

                                                                              The documentation is clear about this. The choice of inspiration (Lua) is a fair one, although I think something a bit more minimal in terms of delimiters and separators is what people want ultimately. But I understand it’s not easy to strike a balance. And to be fair, the multi-line syntax for commands is well thought through and much more convenient than putting a backslash at the end of each line.

                                                                              Good luck with the project, but please: put the part about command invocation and redirection at the beginning of the documentation. All that walkthrough about language constructs doesn’t set the language apart from all other languages.

                                                                              Make a cookbook of examples illustrating the killer features.

                                                                              1. 4

                                                                                This is the first time I see an attempt to do this

                                                                                That means I have failed miserably in marketing / PR

                                                                                https://ngs-lang.org/

                                                                                1. 3

                                                                                  There are many shell languages that have features from general purpose languages like Python:

                                                                                  https://github.com/oilshell/oil/wiki/Alternative-Shells

                                                                                  Including Oil: https://www.oilshell.org/release/latest/doc/oil-language-tour.html

                                                                                  And to pick 3 from the “Python-like” section

                                                                                  https://github.com/abs-lang/abs

                                                                                  https://koi-lang.dev/#fn

                                                                                  https://github.com/alexst07/shell-plus-plus

                                                                                  Hush appears to be very similar to these, except it’s more based on Lua. (At a quick glance it appears to be missing some of the good parts of shell – i.e. that it composes with external processes and the file system.)

                                                                                  1. 1

                                                                                    A robust language with all the constructs and unambiguous syntax of general-purpose programming languages, but attempting a direct syntax for command invocation.

                                                                                    What about powershell?

                                                                                  1. 6

                                                                                    What baffles me about LSP is that it actually does not deal with the most fundamental aspect of language support: syntax highlighting. There is some work to extend the protocol but last time I checked there were no IDEs that supported this extension. So while your auto-completion may be handled by fancy LSP, your syntax highlighting is likely handled by brain-dead, regex-based Textmate syntax “grammar”.
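To make the "regex-based" part concrete, here is a minimal Python sketch of a purely lexical highlighter. The rule table and the `highlight_line` name are made up for illustration and are not taken from any real editor or TextMate grammar. It only shows that this style of highlighting classifies tokens by their shape, so it cannot tell a function name from a local variable the way a language server could.

```python
import re

# Hypothetical token classes; each rule is a plain regex tried at the current position.
TOKEN_RULES = [
    ("comment", re.compile(r"#.*")),
    ("string", re.compile(r'"(?:\\.|[^"\\])*"')),
    ("number", re.compile(r"\d+(?:\.\d+)?")),
    ("keyword", re.compile(r"\b(?:def|return|if|else|for|in)\b")),
    ("identifier", re.compile(r"[A-Za-z_]\w*")),
    ("whitespace", re.compile(r"\s+")),
]


def highlight_line(line):
    """Return (kind, text) pairs for one line, using only local regex matches."""
    tokens = []
    pos = 0
    while pos < len(line):
        for kind, pattern in TOKEN_RULES:
            match = pattern.match(line, pos)
            if match:
                if kind != "whitespace":
                    tokens.append((kind, match.group()))
                pos = match.end()
                break
        else:
            # No rule matched: emit the single character as-is.
            tokens.append(("other", line[pos]))
            pos += 1
    return tokens
```

Every name that is not a keyword comes back as a plain `identifier`, whether it is a function, a parameter, or a type. Telling those apart needs information from the language server, not more regexes.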

                                                                                    1. 6

                                                                                      Several editors that are more serious about it are using Tree-sitter, both for syntax highlighting and for structural editing/navigation.

                                                                                      1. 3

                                                                                        Neovim is doing a lot in this direction, with more still to come. Although a lot of this is still marked as experimental, I switched over to only tree-sitter-based highlighting about a month ago.

                                                                                        I found the early presentations about the development of tree-sitter to be excellent.

                                                                                      2. 4

                                                                                        VS Code supports semantic highlighting via language servers. See here: https://code.visualstudio.com/api/language-extensions/semantic-highlight-guide

                                                                                        No idea if any other IDEs have taken this approach yet, though.

                                                                                        1. 3

                                                                                          They added syntax highlighting in the latest version:

                                                                                          https://microsoft.github.io/language-server-protocol/specification#textDocument_semanticTokens

                                                                                          But yeah, it’s surprising that it took so long – it’s a rather fundamental operation, and it stresses server’s architecture.

                                                                                          1. 2

                                                                                            I think that’s because syntax highlighting is often done synchronously in the UI thread, while all the other features are done in the background? It’s generally supposed to be cheap (and approximate).

                                                                                          1. 40

                                                                                            I tried out this language while it was in early development, writing some of the standard library (hash::crc* and unix::tty::*) to test the language. I wrote about this experience, in a somewhat haphazard way. (Note, that blog post is outdated and not all my opinions are the same. I’ll be trying to take a second look at Hare in the coming days.)

                                                                                            In general, I feel like Hare just ends up being a Zig without comptime, or a Go without interfaces, generics, GC, or runtime. I really hate to say this about a project where the authors have put in such a huge amount of effort over the past year or so, but I just don’t see its niche – the lack of generics means I’d always use Zig or Rust instead of Hare or C. It really looks like Drew looked at Zig, said “too bloated”, and set out to create his own version.

                                                                                            Another thing I find strange: why are you choosing to not support Windows and macOS? Especially since, you know, one of C’s good points is that there’s a compiler for every platform and architecture combination on earth?

                                                                                            That said, this language is still in its infancy, so maybe as time goes and the language finds more users we’ll see more use-cases for Hare.

                                                                                            In any case: good luck, Drew! Cheers!

                                                                                            1. 10

                                                                                              why are you choosing to not support Windows and macOS?

                                                                                              DdV’s answer on HN:

                                                                                              We don’t want to help non-FOSS OSes.

                                                                                              (Paraphrasing a lot, obvs.)

                                                                                              My personal 2c:

                                                                                              Some of the nastier weirdnesses in Go are because Go supports Windows and Windows is profoundly un-xNix-like. Supporting Windows distorted Go severely.

                                                                                              1. 13

                                                                                                Some of the nastier weirdnesses in Go are because Go supports Windows and Windows is profoundly un-xNix-like. Supporting Windows distorted Go severely.

                                                                                                I think that’s the consequence of not planning for Windows support in the first place. Rust’s standard library was built without the assumption of an underlying Unix-like system, and it provides good abstractions as a result.

                                                                                                1. 5

                                                                                                  Amos talks about that here: Go’s file APIs assume a Unix filesystem. Windows support was kludged in later.

                                                                                                2. 5

                                                                                                  Windows and Mac/iOS don’t need help from new languages; it’s rather the other way around. Getting people to try a new language is pretty hard, let alone getting them to build real software in it. If the language deliberately won’t let them target three of the most widely used operating systems, I’d say it’s shooting itself in the foot, if not in the head.

                                                                                                  (There are other seemingly perverse decisions too. 80-character lines and 8-character indentation? Manual allocation with no cleanup beyond a general-purpose “defer” statement? I must not be the target audience for this language, is the nicest response I have.)

                                                                                                  1. 2

                                                                                                    Just for clarity, it’s not my argument. I was just trying to précis DdV’s.

                                                                                                    I am not sure I agree, but then again…

                                                                                                    I am not sure that I see the need for yet another C-replacement. Weren’t Limbo, D, Go, & Rust all attempts at this?

                                                                                                    But that aside: there are a lot of OSes out there that are profoundly un-Unix-like. Windows is actually quite close, compared to, say, Oberon or classic MacOS or Z/OS or OpenVMS or Netware or OS/2 or iTron or OpenGenera or [cont’d p94].

                                                                                                    There is a lot of diversity out there that gets ignored if it doesn’t have millions of users.

                                                                                                    Confining oneself to just OSes in the same immediate family seems reasonable and prudent to me.

                                                                                                3. 10

                                                                                                  My understanding is that the lack of generics and comptime is exactly the differentiating factor here – the project aims at simplicity, and generics/compile time evaluations are enormous cost centers in terms of complexity.

                                                                                                  1. 20

                                                                                                    You could say that generics and macros are complex, relative to the functionality they offer.

                                                                                                    But I would put comptime in a different category – it’s reducing complexity by providing a single, more powerful mechanism. Without something like comptime, IMO static languages lose significant productivity / power compared to a dynamic language.

                                                                                                    You might be thinking about things from the tooling perspective, in which case both features are complex (and probably comptime even more because it’s creating impossible/undecidable problems). But in terms of the language I’d say that there is a big difference between the two.

                                                                                                    I think a language like Hare will end up pushing that complexity out to the tooling. I guess it’s like Go where they have go generate and relatively verbose code.

                                                                                                    1. 3

                                                                                                      Yup, agree that zig-style seamless comptime might be a great user-facing complexity reducer.

                                                                                                      1. 16

                                                                                                        I’m not being Zig-specific when I say that, by definition, comptime cannot introduce user-facing complexity. Unlike other attributes, comptime only exists during a specific phase of compiler execution; it’s not present during runtime. Like a static type declaration, comptime creates a second program executed by the compiler, and this second program does inform the first program’s runtime, but it is handled entirely by the compiler. Unlike a static type declaration, the user uses exactly the same expression language for comptime and runtime.

                                                                                                        If we think of metaprogramming as inherent complexity, rather than incidental complexity, then an optimizing compiler already performs compile-time execution of input programs. What comptime offers is not additional complexity, but additional control over complexity which is already present.

                                                                                                        To put all of this in a non-Zig context, languages like Python allow for arbitrary code execution during module loading, including compile-time metaprogramming. Some folks argue that this introduces complexity. But the complexity of the Python runtime is there regardless of whether modules get an extra code-execution phase; the extra phase provides expressive power for users, not new complexity.
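A minimal sketch of that extra code-execution phase, assuming a made-up module called `demo_tables` that is loaded from a string so the example stays self-contained (`load_module` is a hypothetical helper, not a standard API). The module body is ordinary code that runs once at load time, builds a table, and generates two functions.

```python
import types

# Source of a hypothetical module. Everything at the top level of the
# module body runs once, when the module is loaded.
MODULE_SOURCE = '''
SQUARES = {n: n * n for n in range(5)}

# Generate functions in a loop while the module is loading.
for _name, _factor in [("double", 2), ("triple", 3)]:
    globals()[_name] = (lambda factor: lambda x: x * factor)(_factor)


def square(n):
    return SQUARES[n]
'''


def load_module(name, source):
    """Hypothetical helper: execute `source` as the body of a new module."""
    module = types.ModuleType(name)
    exec(compile(source, name, "exec"), module.__dict__)
    return module


demo = load_module("demo_tables", MODULE_SOURCE)
```

After loading, `demo.square`, `demo.double` and `demo.triple` are plain functions. The table and the generated functions exist only because the module body ran first.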

                                                                                                        1. 8

                                                                                                          Yeah, but I feel like this isn’t what people usually mean when they say some feature “increases complexity.”

                                                                                                          I think they mean something like: Now I must know more to navigate this world. There will be, on average, a wider array of common usage patterns that I will have to understand. You can say that the complexity was already there anyway, but if, in practice, it was usually hidden, and now it’s not, doesn’t that matter?

                                                                                                          then an optimizing compiler already performs compile-time execution of input programs.

                                                                                                          As a concrete example, I don’t have to know about a new keyword or what it means when the optimizing compiler does its thing.

                                                                                                          1. 2

                                                                                                            A case can be made that this definition of complexity is a “good thing” to improve code quality / “matters”:

                                                                                                            Similar arguments can be used for undefined behavior (UB) as it changes how you navigate a language’s world. But for many programmers, it can be usually hidden by code seemingly working in practice (i.e. not hitting race conditions, not hitting unreachable paths for common input, updating compilers, etc.). I’d argue that this still matters (enough to introduce tooling like UBSan, ASan, and TSan at least).

                                                                                                            The UB is already there, both for correct and incorrect programs. Providing tools to interact with it (i.e. __builtin_unreachable -> comptime) as well as explicit ways to do what you want correctly (i.e. __builtin_add_overflow -> comptime specific lang constructs interacted with using normal code e.g. for vs inline for) would still be described as “increases complexity” under this model which is unfortunate.

                                                                                                            1. 1

                                                                                                              The UB is already there, both for correct and incorrect programs.

                                                                                                              Unless one is purposefully using a specific compiler (or set thereof), that actually defines the behaviour the standard didn’t, then the program is incorrect. That it just happens to generate correct object code with this particular version of that particular compiler on those particular platforms is just dumb luck.

                                                                                                              Thus, I’d argue that tools like MSan, ASan, and UBSan don’t introduce any complexity at all. They just reveal the complexity of UB that was already there, and they do so reliably enough that they actually relieve me of some of the mental burden I previously had to shoulder.

                                                                                                          2. 5

                                                                                                            languages like Python allow for arbitrary code execution during module loading, including compile-time metaprogramming.

                                                                                                            Python doesn’t allow compile-time metaprogramming for any reasonable definition of the word. Everything happens and is introspectable at runtime, which allows you to do similar things, but it’s not compile-time metaprogramming.

                                                                                                            One way to see this is that sys.argv is always available when executing Python code. (Python “compiles” byte code, but that’s an implementation detail unrelated to the semantics of the language.)

                                                                                                            On the other hand, Zig and RPython are staged. There is one stage that does not have access to argv (compile time), and another one that does (runtime).
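A minimal Python sketch of the `sys.argv` point, assuming a hypothetical `make_record_class` helper and a made-up `--verbose` flag. The class is built by ordinary code while the program runs, so that code can read `sys.argv` like anything else. In a staged language the equivalent type-building code would run in the stage that has no argv.

```python
import sys


def make_record_class(name, fields):
    """Hypothetical helper: build a class from a list of field names."""
    fields = tuple(fields)

    def __init__(self, *values):
        if len(values) != len(fields):
            raise TypeError(f"{name} expects {len(fields)} values")
        for field, value in zip(fields, values):
            setattr(self, field, value)

    return type(name, (object,), {"__init__": __init__, "fields": fields})


# This "metaprogramming" is ordinary runtime code, so it can see sys.argv.
# The shape of the class depends on a (made-up) command-line flag.
field_names = ["x", "y"]
if "--verbose" in sys.argv:
    field_names.append("debug_label")

Point = make_record_class("Point", field_names)
```

Running the same script with and without `--verbose` gives `Point` a different set of fields, which is exactly what a compile-time stage cannot depend on.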

                                                                                                            Related to the comment about RPython I linked here:

                                                                                                            http://www.oilshell.org/blog/2021/04/build-ci-comments.html

                                                                                                            https://old.reddit.com/r/ProgrammingLanguages/comments/mlflqb/is_this_already_a_thing_interpreter_and_compiler/gtmbno8/

                                                                                                            1. 4

                                                                                                              Yours is a rather unconventional definition of complexity.

                                                                                                              1. 5

                                                                                                                I am following the classic paper, “Out of the Tar Pit”, which in turn follows Brooks. In “Abstractive Power”, Shutt distinguishes complexity from expressiveness and abstractedness while relating all three.

                                                                                                                We could always simply go back to computational complexity, but that doesn’t capture the usage in this thread. Edit for clarity: Computational complexity is a property of problems and algorithms, not a property of languages nor programming systems.

                                                                                                                1. 3

                                                                                                                  Good faith question: I just skimmed the first ~10 pages of “Out of the Tar Pit” again, but was unable to find the definition that you allude to, which would exclude things like the comptime keyword from the meaning of “complexity”. Can you point me to it or otherwise clarify?

                                                                                                                  1. 4

                                                                                                                    Sure. I’m being explicit for posterity, but I’m not trying to be rude in my reply. First, the relevant parts of the paper; then, the relevance to comptime.

                                                                                                                    On p1, complexity is defined as the tendency of “large systems [to be] hard to understand”. Unpacking their em-dash and subjecting “large” to the heap paradox, we might imagine that complexity is the amount of information (bits) required to describe a system in full detail, with larger systems requiring more information. (I don’t really know what “understanding” is, so I’m not quite happy with “hard to understand” as a concrete definition.) Maybe we should call this “Brooks complexity”.

                                                                                                                    On p6, state is a cause of complexity. But comptime does not introduce state compared to an equivalent non-staged approach. On p8, control-flow is a cause of complexity. But comptime does not introduce new control-flow constructs. One could argue that comptime requires extra attention to order of evaluation, but again, an equivalent program would have the same order of evaluation at runtime.

                                                                                                                    On p10, “sheer code volume” is a cause of complexity, and on this point, I fully admit that I was in error; comptime is a long symbol, adding size to source code. In this particular sense, comptime adds Brooks complexity.

                                                                                                                    Finally, on a tangent to the top-level article, p12 explains that “power corrupts”:

                                                                                                                    [I]n the absence of language-enforced guarantees (…) mistakes (and abuses) will happen. This is the reason that garbage collection is good — the power of manual memory management is removed. … The bottom line is that the more powerful a language (i.e. the more that is possible within the language), the harder it is to understand systems constructed in it.

                                                                                                                    comptime and similar metaprogramming tools don’t make anything newly possible. It’s an annotation to the compiler to emit specialized code for the same computational result. As such, they arguably don’t add Brooks complexity. I think that this argument also works for inline, but not for @compileError.

                                                                                                        2. 18

                                                                                                          My understanding is that the lack of generics and comptime is exactly the differentiating factor here – the project aims at simplicity, and generics/compile time evaluations are enormous cost centers in terms of complexity.

                                                                                                          Yeah, I can see that. But under what conditions would I care how small, big, or icecream-covered the compiler is? Building/bootstrapping for a new platform is a one-time thing, but writing code in the language isn’t. I want the language to make it as easy as possible on me when I’m using it, and omitting features that were around since the 1990’s isn’t helping.

                                                                                                          1. 8

                                                                                                            Depends on your values! I personally see how, eg, generics entice users to write overly complicated code which I then have to deal with as a consumer of libraries. I am not sure that not having generics solves this problem, but I am fairly certain that the problem exists, and that some kind of solution would be helpful!

                                                                                                            1. 3

                                                                                                              In some situations, emitted code size matters a lot (and with generics, that can quickly grow out of hand without you realizing it).

                                                                                                              1. 13

                                                                                                                In some situations

                                                                                                                I see what you mean, but I think in those situations it’s not too hard to, you know, refrain from using generics. I see no reason to force all language users to not use that feature. Unless Hare is specifically aiming for that niche, which I don’t think it is.

                                                                                                                1. 4

                                                                                                                  There are very few languages that let you switch between monomorphisation and dynamic dispatch as a compile-time flag, right? So if you have dependencies, you’ve already had the choice forced on you.

                                                                                                                  1. 6

                                                                                                                    If you don’t like how a library is implemented, then don’t use it.

                                                                                                                    1. 2

                                                                                                                      Ah, the illusion of choice.

                                                                                                            2. 10

                                                                                                              Where is the dividing line? What makes functions “not complex” but generics, which are literally functions evaluated at compile time, “complex”?

                                                                                                              1. 14

                                                                                                                I don’t know where the line is, but I am pretty sure that this is past that :D

                                                                                                                https://github.com/diesel-rs/diesel/blob/master/diesel_cli/src/infer_schema_internals/information_schema.rs#L146-L210

                                                                                                                1. 17

                                                                                                                  Sure, that’s complicated. However:

                                                                                                                  1. That’s the inside of the inside of a library modeling a very complex domain. Complexity needs to live somewhere, and I am not convinced that complexity that is abstracted away and provides value is a bad thing, as much of the “let’s go back to simpler times” discourse seems to imply. I’d rather someone take the time to solve something once than have me solve it every time, even if with simpler code.

                                                                                                                  2. Is this just complex, or is it actually doing more than the equivalent in other languages? Rust allows for expressing constraints that are not easily (or at all) expressible in other languages, and static types allow for expressing more constraints than dynamic types in general.

                                                                                                                  In sum, I’d reject a pull request with this type of code in an application, but don’t mind it at all in a library.

                                                                                                                  1. 4

                                                                                                                    that’s the inside of the inside of a library modeling a very complex domain. Complexity needs to live somewhere,

                                                                                                                    I find that’s rarely the case. It’s often possible to tweak the approach to a problem a little bit, in a way that allows you to simply omit huge swaths of complexity.

                                                                                                                    1. 3

                                                                                                                      Possible, yes. Often? Not convinced. Practical? I am willing to bet some money that no.

                                                                                                                      1. 7

                                                                                                                        I’ve done it repeatedly, as well as seeing others do it. Occasionally, though admittedly rarely, reducing the size of the codebase by an order of magnitude while increasing the number of features.

                                                                                                                        There’s a huge amount of code in most systems that’s dedicated to solving optional problems. Usually the unnecessary problems are imposed at the system design level, and changing the way the parts interface internally allows simple reuse of appropriate large-scale building blocks and subsystems, reduces the number of building blocks needed, and drops entire sections of translation and negotiation glue between layers.

                                                                                                                        Complexity rarely needs to be somewhere – and where it does need to be, it’s often in the ad-hoc, problem-specific data structures that simplify the domain. A good data structure can act as a Laplace transform for the entire problem space of a program, even if it takes a few thousand lines to implement. It lets you take the problem, transform it to a space where the problem is easy to solve, and put it back directly.

                                                                                                                  2. 7

                                                                                                                    You can write complex code in any language, with any language feature. The fact that someone has written complex code in Rust with its macros has no bearing on the feature itself.

                                                                                                                    1. 2

                                                                                                                      It’s the Rust culture that encourages things like this, not the fact that Rust has parametric polymorphism.

                                                                                                                      1. 14

                                                                                                                        I am not entirely convinced – to me, it seems there’s a high correlation between languages with parametric polymorphism and languages with a culture of hard-to-understand abstractions (Rust, C++, Scala, Haskell). Even in Java, parts that touch generics tend to require some mind-bending (producer extends, consumer super).

                                                                                                                        I am curious how Go’s generics will turn out in practice!

                                                                                                                        1. 8

                                                                                                                          Obligatory reference for this: F# Designer Don Syme on the downsides of type-level programming

                                                                                                                          I don’t want F# to be the kind of language where the most empowered person in the discord chat is the category theorist.

                                                                                                                          It’s a good example of the culture and the language design being related.

                                                                                                                          https://lobste.rs/s/pkmzlu/fsharp_designer_on_downsides_type_level

                                                                                                                          https://old.reddit.com/r/ProgrammingLanguages/comments/placo6/don_syme_explains_the_downsides_of_type_classes/

                                                                                                                          which I linked here: http://www.oilshell.org/blog/2022/03/backlog-arch.html

                                                                                                                2. 3

                                                                                                                  In general, I feel like Hare just ends up being a Zig without comptime, or a Go without interfaces, generics, GC, or runtime. … I’d always use Zig or Rust instead of Hare or C.

                                                                                                                  What if you were on a platform unsupported by LLVM?

                                                                                                                  When I was trying out Plan 9, lack of LLVM support really hurt; a lot of good CLI tools these days are being written in Rust.

                                                                                                                  1. 15

                                                                                                                    Zig has rudimentary plan9 support, including a linker and native codegen (without LLVM). We’ll need more plan9 maintainers to step up if this is to become a robust target, but the groundwork has been laid.

                                                                                                                    Additionally, Zig has a C backend for those targets that only ship a proprietary C compiler fork and do not publish ISA details.

                                                                                                                    Finally, Zig has the ambitions to become the project that is forked and used as the proprietary compiler for esoteric systems. Although of course we would prefer for businesses to make their ISAs open source and publicly documented instead. Nevertheless, Zig’s MIT license does allow this use case.

                                                                                                                    1. 2

                                                                                                                      I’ll be damned! That’s super impressive. I’ll look into Zig some more next time I’m on Plan 9.

                                                                                                                    2. 5

                                                                                                                      I think that implies that your platform is essentially dead (I would like to program my Amiga in Rust or Swift or Zig, too) or so off-mainstream (MVS comes to mind) that those tools wouldn’t serve any purpose anyway because they’re too alien.

                                                                                                                      1. 5

                                                                                                                        Amiga in Rust or Swift or Zig, too)

                                                                                                                        Good news: LLVM does support 68k, in part thanks to communities like the Amiga community. LLVM doesn’t like to include stuff unless there’s a sufficient maintainer base, so…

                                                                                                                        MVS comes to mind

                                                                                                                        Bad news: LLVM does support S/390. No idea if it’s just Linux or includes MVS.

                                                                                                                        1. 1

                                                                                                                          Good news: LLVM does support 68k

                                                                                                                          Unfortunately, that doesn’t by itself mean that compilers (apart from clang) get ported, or that the platform gets added as part of a target triple. For instance, Plan 9 runs on platforms with LLVM support, yet isn’t supported by LLVM.

                                                                                                                          Bad news: LLVM does support S/390.

                                                                                                                          I should have written VMS instead.

                                                                                                                          1. 1
                                                                                                                        2. 2

                                                                                                                          I won’t disagree with describing Plan 9 as off-mainstream ;) But I’d still like a console-based Signal client for that OS, and the best (only?) one I’ve found is written in Rust.

                                                                                                                    1. 2

                                                                                                                      Hm nice post, I agree it’s interesting how the LSP has had pretty big effects on the ecosystem (although I personally don’t want to use VS Code – pretty sure it’s had telemetry enabled by default from day 1).

                                                                                                                      I would frame it as Microsoft creating a new and successful “narrow waist”. It’s obviously beneficial to bootstrap a narrow waist with some kind of monopoly advantage – a huge piece of software like VSCode, or in WebAssembly’s case getting it in 3 browsers.

                                                                                                                      I mention them both briefly here: A Sketch of the Biggest Idea in Software Architecture

                                                                                                                      That post also alludes to the “lowest common denominator” / compromise problem of narrow waists – you can be stuck with the intersection of language features, not the union. Protobufs also have this issue.


                                                                                                                      As for the M x N argument, I’m not sure it’s as black and white as you seem to imply. I agree with some comments on Hacker News which back that up – e.g. I definitely knew people who swore by Eclipse and Netbeans, and Visual Studio 6 and .NET were extremely popular and relatively good IDEs. They had very good debugging support.

                                                                                                                      https://news.ycombinator.com/item?id=31151048

                                                                                                                      I was confused by some of the claims in the middle about duplicated implementations (perhaps because I don’t really know how VSCode works). I don’t think that invalidates the M x N argument.

                                                                                                                      You can still have a small amount of O(M x N) glue (duplicated LSP protocol implementations, some duplication in plugins) but the point is that you don’t want to duplicate compilers, yes? You don’t want to write a Java compiler for every IDE!

                                                                                                                      As far as I know Eclipse and IntelliJ used completely different compilers for their IDE support. (And NetBeans too?)
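
                                                                                                                      To make the M x N point concrete, here is a minimal Python sketch of the counting argument. The editor and language lists are illustrative, and the model (one bespoke integration per editor/language pair, versus one protocol client per editor plus one server per language) is a simplifying assumption, not a description of how any particular IDE is built.

```python
from itertools import product

# Illustrative lists; any M editors and N languages would do.
editors = ["VS Code", "Vim", "Emacs", "Eclipse"]
languages = ["Java", "Python", "Rust"]

# Without a shared protocol: one bespoke integration per (editor, language) pair.
pairwise = list(product(editors, languages))

# With a shared protocol: one client per editor plus one server per language.
shared = [f"{e} client" for e in editors] + [f"{lang} server" for lang in languages]

print(len(pairwise))  # M * N = 12
print(len(shared))    # M + N = 7
```

                                                                                                                      The small amount of leftover glue mentioned above would show up as extra per-pair work on top of the M + N pieces, but the expensive part (the compiler or analyzer behind each server) is written once per language.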


                                                                                                                      OK I think it relates to this comment on HN:

                                                                                                                      The point is that you don’t need common protocol. A bunch of language-specific protocols would have worked! It’s interesting to ask why that didn’t happen

                                                                                                                      I don’t know the details well enough, but I would guess that you can get some decent baseline support by implementing the common protocol, and then if you want more features, use all the bells and whistles of a particular language and of a particular editor?

                                                                                                                      I think that is a pretty common compromise in the “narrow waist” architectures. You have a lowest common denominator and then some extensions.

                                                                                                                      https://news.ycombinator.com/item?id=31152466 (It does seem like there is a fair bit of disagreement on this point)

                                                                                                                      1. 3

                                                                                                                        The “single binary” blog thing has mystified me. (I saw a couple recent posts but didn’t comment.)

                                                                                                                        I guess this is because rsync has pitfalls? The trailing slash issue was mentioned recently and that is indeed annoying.

                                                                                                                        Otherwise there’s no difference between a single file and a tree of files. And in fact the tree of files is better because you can deploy incrementally. This matters for a site like https://www.oilshell.org (which is pretty big now).

                                                                                                                        When I update a single blog post it takes about 1 second to deploy because of rsync (and 1 second to rebuild the site because of GNU make, which I want to replace with Ninja)

                                                                                                                        Using git would also be better than copying the entire blog, since it implicitly has differential compression.


                                                                                                                        Another thing I’m mystified by is people writing their own web servers, setting up their own nginx, managing certificates, etc.

                                                                                                                        I just use shared hosting (Dreamhost, and I tried Nearly Free Speech, but didn’t think it was as good). But you can also use Github pages?

                                                                                                                        I think shared hosting dropped the ball a lot in terms of marketing and user friendliness. They didn’t figure out a good deployment and tooling story. They left everyone alone with shell and I guess most people are not fluent with shell. To be honest I wasn’t as fluent with it when I first got the account! (way back in 2009)

                                                                                                                        But now I think it’s by far the best solution because it gives me a standard / commodity interface. I can use git / rsync / scp, and I don’t have to manage any servers. It’s “serverless”.

                                                                                                                        Maybe another problem is that shared hosting was associated with PHP. I think it fell out of favor because it couldn’t run Rails or Django well.

                                                                                                                        But then people started writing static site generators, and doing more in client side JavaScript, and it’s perfect for those kinds of sites. So IMO it’s a more stable Netlify or Github pages.

                                                                                                                        I’ve heard people complain that with Github pages you are never sure when the changes will be visible. Did you do something wrong, or is the site just slow? Not a problem when you have shell access.

                                                                                                                        I guess the other downside is that it costs some money, like $5 or $10/month, but I view that as a positive because the business model is self-sustaining. Dreamhost has been stable since 2009 for me. There are many fewer dark patterns. The only annoyance is that they advertise a lower first-year rate for domains, so the second year you’re surprised. But pretty much everyone does that now :-/


                                                                                                                        Another trick I use is to bundle 10,000 static files together in a zip file and serve it with a FastCGI script .wwz

                                                                                                                        https://github.com/oilshell/wwz

                                                                                                                        This is basically for immutable archives and cuts down on the metadata. I have automated scripts that generate 10,000 files and rsync will slow down on all the stat() calls, so I just put them in a big zip file. And again the nice thing is that I am not maintaining any web servers or kernels.
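
                                                                                                                        For anyone curious what serving out of a zip looks like, here is a rough Python sketch using only the standard library. It is not the actual wwz code: the function name read_from_archive and the (status, content type, body) return shape are made up for illustration, and a real FastCGI script would also have to parse the request path and write headers.

```python
import io
import mimetypes
import zipfile


def read_from_archive(archive, member):
    """Return (status, content_type, body) for one file stored in a zip archive."""
    with zipfile.ZipFile(archive) as zf:
        try:
            body = zf.read(member)
        except KeyError:
            # ZipFile.read raises KeyError when the member does not exist.
            return 404, "text/plain", b"not found"
    content_type, _ = mimetypes.guess_type(member)
    return 200, content_type or "application/octet-stream", body


# Build a tiny in-memory archive to stand in for the real one on disk.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("index.html", "<h1>hello</h1>")

print(read_from_archive(buf, "index.html"))
print(read_from_archive(buf, "missing.html"))
```

                                                                                                                        The point is that the server side only ever touches one file on disk, so rsync has a single file to compare instead of 10,000.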

                                                                                                                        1. 4

                                                                                                                          Some people prefer to use things they understand, and part of that might be standing up a web server for self-hosting. It’s not rocket science to configure nginx + Let’s Encrypt… And part of it, I suspect, is that shared hosting sites come and go, so not having to be reliant upon a host which may be out of business tomorrow is also a benefit.

                                                                                                                          I had mostly negative experiences using early shared hosting interfaces (ugh cpanel..), wordpress was a little better (and you could self host it) but it is waaaay overkill for simple static sites. Not to mention it’s a beast to set up.

                                                                                                                          Of course there are risks and additional costs with self-hosting stuff. But I’d expect the person who created a new shell to understand the trade-offs/benefits, generally speaking, between using what already exists and making something brand new :D

                                                                                                                          1. 3

                                                                                                                            Yeah I think part of it is that early shared hosting sites did suck.

                                                                                                                            I remember I used 1and1 and it was full of sharp edges, and the control panel was also confusing.

                                                                                                                            Specifically I think Dreamhost is quite good. I have been using it since 2009. Nearly Free Speech is pretty good, but it seems to use some kind of network-attached disk which makes it slow in my limited experience. (Also, they don’t seem to advertise/emphasize that the shell account is BSD! Important for a Linux user.)

                                                                                                                            I also maintain nginx and LetsEncrypt on a VPS for a dynamic site I’ve been using for over a decade. (Used to be Apache and a certificate I bought, etc.)

                                                                                                                            I think shared hosting is by far superior, although as mentioned I recognize there are a few sharp edges and non-obvious choices. I’m not responsible for the kernel either.

                                                                                                                            I would call using shared hosting “using what already exists” … And standing up your own VPS as “making something new”. It will be your own special snowflake :)


                                                                                                                            One interesting thing is that nginx doesn’t actually support multi-tenant sites with separate user dirs and .htaccess. So all shared hosting uses Apache as far as I know. That isn’t a problem since I never touch Apache or nginx on shared hosting – I just drop files in a directory and it’s done.

                                                                                                                            But I would say there’s strictly less to understand in shared hosting. I rarely log into the control panel. I just drop files in a directory and that’s about it. My interface is ssh / rsync / git. Occasionally I do have to look up how to change an .htaccess, but that’s been maybe 2-3 times in ~12 years.

                                                                                                                          2. 2

                                                                                                                            Another thing I’m mystified by is people writing their own web servers, setting up their own nginx, managing certificates, etc.

                                                                                                                            I can give you my reason for this one. I run my site and other stuff on a simple VPS with nginx. That was not my first idea for a solution, but I discovered that there are hardly any offerings for what I want, and what I want is dead simple: I have a bunch of HTML pages and I want them to be served on my domain. You cannot do this with the well-known platforms. They all require you to put the files in a git repository.

                                                                                                                            I am not going to set up some scripts that automatically commit new versions to a git repository and then push that. There is no need to: the HTML gets generated from other projects that are already version managed. sudo apt install nginx is actually the easiest solution here. I want to be able to do one rsync command because that is all that is needed.

                                                                                                                            If you look a bit further, there are some companies that offer this, but the costs are always equal to or higher than renting a VPS and they will always limit your flexibility. There are probably some giant tech companies that have a way of getting this done for free or for very cheap, but it will inevitably involve accounts, convoluted dashboards/configuration and paying constant attention not to accidentally use a resource that costs a lot of money.

                                                                                                                            Perhaps it sounds complicated for someone who has never seen a shell, but managing a site with nginx, certbot and rsync is about as simple as you can get for this.

                                                                                                                            1. 1

                                                                                                                              Paying $5/mo for a VPS has improved my computing QoL immensely.

                                                                                                                              • It hosts my sites
                                                                                                                              • I use it to host tmux to have IRC access
                                                                                                                              • I can run random net stuff from it

                                                                                                                              All in all a great tool and service.

                                                                                                                            2. 2

                                                                                                                              I guess this is because rsync has pitfalls?

                                                                                                                              it’s pretty simple for me: i don’t like using rsync. i don’t like the awkward syntax (trailing slash or no?), i don’t like managing files, putting my SSH keys everywhere, etc. i understand rsync just fine - i use it professionally.

                                                                                                                              but when it comes to my free time, i optimize for maintainability, reliability, and fun. i can’t explain the titillating feeling i get when i see a binary compile & launch my entire contained website - and i don’t have to :D it’s just good subjective fun.

                                                                                                                              it’s nice to look at my website’s code and know that it’s self-contained and doesn’t rely on this-being-here or that-being-there. it launches exactly the same way on my local machine as it does on my remote machine. also, my website isn’t just about writing posts - it’s also about providing tools. for example, i include an rss reader, an rss fetcher, an age decryption utility, an IP fetcher, etc - all without any javascript! and it’s all right there & testable locally! it’s fun to treat my website like an extension of my ideas instead of “yet another blog” - and my one-binary approach really lends itself to that.

                                                                                                                              I can use git / rsync / scp, and I don’t have to manage any servers.

                                                                                                                              very fair. i have a server sitting around that i like poking once in awhile

                                                                                                                              But you can also use Github pages?

                                                                                                                              in my blog post, i have a section dedicated to why i don’t want to use some-other-hosting-platform for my personal website. the idea that my website is almost entirely under my own control is very important to me. eventually, i plan on porting it to a small SBC running in a van! a future post will describe this process :D

                                                                                                                              not knocking anyone who chooses something like github pages btw, it’s just not for everyone - it depends on the person’s values.

                                                                                                                              i hope this helps make things less mystifying.

                                                                                                                              1. 2

                                                                                                                                trailing slash or no?

                                                                                                                                You should always use trailing slashes with rsync, that’s it. rsync -a src/ dst/

                                                                                                                                1. 2

                                                                                                                                  i think that this is misleading given that trailing slashes on the source can change rsync’s behavior

                                                                                                                                  A trailing slash on a source path means “copy the contents of this directory”. Without a trailing slash it means “copy the directory”.
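
                                                                                                                                  A tiny Python model of that rule, assuming a recursive copy such as rsync -a. It only computes where a file would land and never invokes rsync; the function name landing_path is made up for illustration.

```python
import posixpath


def landing_path(src, dst, relative_file):
    """Model where src/<relative_file> ends up under dst for a recursive copy."""
    if src.endswith("/"):
        # Trailing slash: copy the directory's contents straight into dst.
        return posixpath.join(dst, relative_file)
    # No trailing slash: copy the directory itself, so its name appears under dst.
    return posixpath.join(dst, posixpath.basename(src), relative_file)


print(landing_path("src/", "dst/", "a.txt"))  # dst/a.txt
print(landing_path("src", "dst/", "a.txt"))   # dst/src/a.txt
```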

                                                                                                                                2. 1

                                                                                                                                  Yes this makes sense for the dynamic parts of the site. I can see why you would want to have it all in a single language and binary.

                                                                                                                                  I’m more skeptical about the static content, but it all depends on what your site does. If you don’t have much static content then the things I mentioned won’t come up … They may come up later, in which case it’s easy enough to point the static binary at a directory tree.

                                                                                                                                  (Although it looks like Go has some issues there; Python does too. That’s one reason I like “borrowing” someone else’s Apache and using it directly! It’s very time-tested.)

                                                                                                                                  1. 1

                                                                                                                                    totally! if i had like 10k pages of content i’m sure things would look different. :3

                                                                                                                                3. 2

                                                                                                                                  Why? Because Outsource All The Things! limits what you can actually do. I wrote my own gopher server [1][2] because the ones that exist don’t do what I want (mainly, to serve up my blog via gopher). And while I didn’t write my own web server, I do run Apache, because I then have access to configure it as I want [3] and by now, you can pretty much do anything with it [4]. For instance my blog. I have several methods of updating it. Yes, I have the obligatory web interface, but I’ve only used that about a handful of times over the past two decades. I can add an entry directly (as a file), but because I also run my own email server (yes, I’m crazy like that) I can also add blog entries via email (my preferred method).

                                                                                                                                  Is it easy to move? Eh … the data storage is just the file system—the hard part is getting the web server configured and my blogging engine [5] compiled and running. But I like the control I have.

                                                                                                                                  Now, I’m completely mystified and baffled as to why anyone in their right mind would write a program in shell. To me, that’s just insane.

                                                                                                                                  [1] gopher://gopher.conman.org/

                                                                                                                                  [2] https://github.com/spc476/port70

                                                                                                                                  [3] You would never know that my blog is CGI based.

                                                                                                                                  [4] Even if it means using a custom written Apache module, which I did back in the day: https://github.com/spc476/mod_litbook, still available at http://literature.conman.org/bible/.

                                                                                                                                  1. 1

                                                                                                                                    I find the limitation in outsourcing to often be more about learning than features per se. I could well write a gopher server myself, but I’d expect it to be worse than existing gopher servers. Still, it would be a worthwhile endeavor if I didn’t know how to write servers.

                                                                                                                                    In a similar vein, I created a wiki engine that works over gemini and is edited with sed commands. It’s probably the worst wiki UX in existence, but last I checked it was the only wiki running natively on gemini (only). I learned a lot in the process, and some other people also found the approach interesting (even if not useful). Maybe it even inspired some other gemini software.

                                                                                                                                    So yes, there are many good reasons for writing software even when the problem has already been solved.

                                                                                                                                    1. 1

                                                                                                                                      I think either extreme is bad – outsourcing everything or writing everything yourself.

                                                                                                                                      Also not all “outsourcing” is the same. I prefer to interface with others using standard interfaces like a www/ dir, rather than vendor-specific interfaces (which are subject to churn and lock-in).

                                                                                                                                      Shell helps with that – it’s reuse on a coarse-grained level. I am OK with dependencies as long as I can hide them and contain them in a separate process :)

                                                                                                                                      I hinted at similar ideas on the blog:

                                                                                                                                      http://www.oilshell.org/blog/2021/07/cloud-review.html

                                                                                                                                      More possible concept names: Distributed Shell Scripts and Parasitic Shell Scripts. These names are meant to get at the idea that shell can be an independent control plane, while cloud providers are the dumb data plane.

                                                                                                                                      A prominent example is that we reuse multiple CI providers but they are used as “dumb” resources. We have shell to paper over the differences and make programs that are portable between clouds.
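                                                                                                                                      A minimal sketch of what “papering over the differences” can look like, written in Python rather than shell for illustration. It is not the actual Oil CI code: the function name and the returned keys are hypothetical, and GitHub Actions and Travis are only example providers, chosen because their environment variables are well documented.

```python
import os


def ci_context(env=None):
    """Return a provider-neutral view of the current CI run.

    Hypothetical helper: the rest of the build only reads the returned
    dict, so it never needs to know which provider it is running on.
    """
    env = os.environ if env is None else env
    if env.get("GITHUB_ACTIONS") == "true":
        # GitHub Actions exposes the commit under GITHUB_SHA.
        return {"provider": "github-actions", "commit": env.get("GITHUB_SHA", "")}
    if env.get("TRAVIS") == "true":
        # Travis CI exposes the commit under TRAVIS_COMMIT.
        return {"provider": "travis", "commit": env.get("TRAVIS_COMMIT", "")}
    # Fallback: a developer machine or an unrecognized provider.
    return {"provider": "local", "commit": ""}
```

                                                                                                                                      Supporting another provider means adding one branch here; the scripts that consume the dict stay the same.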

                                                                                                                                      This also relates to my recent “narrow waist” posts – a static website is a narrow waist because it can be produced by N different tools (my Python+shell scripts, Jekyll, Hugo, etc.) and it is understood by multiple cloud providers (shared hosting, GitHub Pages, even your browser).

                                                                                                                                      So it’s a good interface to serve as the boundary between code you own and code that other people own.
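                                                                                                                                      A toy sketch of the static-site narrow waist, assuming the only contract is “a directory of files”. All names here are hypothetical: two stand-in producers write into a www/ directory, and one stand-in consumer (anything that serves or uploads the directory) only ever looks at the files.

```python
import tempfile
from pathlib import Path


def produce_by_hand(www: Path) -> None:
    """Hypothetical producer 1: a hand-rolled script that writes HTML directly."""
    (www / "index.html").write_text("<h1>hello</h1>\n")


def produce_from_posts(www: Path) -> None:
    """Hypothetical producer 2: a tiny 'generator' that renders a dict of posts."""
    posts = {"index": "hello"}
    for name, body in posts.items():
        (www / f"{name}.html").write_text(f"<h1>{body}</h1>\n")


def list_site(www: Path) -> list[str]:
    """Hypothetical consumer: sees only the files, not the tool that made them."""
    return sorted(p.relative_to(www).as_posix() for p in www.rglob("*") if p.is_file())


def build_and_list(producer) -> list[str]:
    """Run one producer against a fresh directory and hand it to the consumer."""
    with tempfile.TemporaryDirectory() as tmp:
        www = Path(tmp)
        producer(www)
        return list_site(www)
```

                                                                                                                                      With N producers and M consumers, each side targets the directory layout once, so the work is N + M rather than N x M pairwise integrations.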

                                                                                                                                      Here is a comment thread with some more links (login required):

                                                                                                                                      https://oilshell.zulipchat.com/#narrow/stream/266575-blog-ideas/topic/Comments.3A.20FastCGI.20and.20shared.20hosting