Threads for gamache

  1. 7

    I agree that José may have made the wrong call here, making parens an optional part of function application. Ruby damage.

    But I disagree that it’s much of a problem in practice:

    • the compiler is more than happy to issue a warning about implicit parens, which this blog post skips over
    • application of a function that is the value of a variable requires a dot, e.g., myfun.() instead of myfun or myfun(), which this blog post does not mention
    • Resolving the question “am I referring to the function or local value?” is a matter of checking what’s in scope, kind of like one would expect in a modern, lexically scoped language
    1. 2

      The thing is that parens-less calls are required for “syntax purposes”, as defining functions as:

      def(x(), do: x)
      

      isn’t the most pleasant syntax out there. We would end up with parentheses everywhere, and it would look like M-expressions.

      1. 2

        That’s a good point. The other way to provide this would be something like reader macros, and I do not find myself wishing we had those around.

    1. 11

      There’s no overhead compared to calling regular Erlang code, and the generated APIs are idiomatic Erlang.

      Any Gleam package published to Hex (the BEAM ecosystem package manager) now includes the project code compiled to Erlang. Once the various build tools of the other languages have been updated they will be able to depend on packages written in Gleam without having the Gleam compiler installed.

      We want to fit into the wider ecosystem so smoothly that Erlang and Elixir programmers might not even know they’re using Gleam at all! All they need to know is that the library they are using is rock-solid and doesn’t fail on them.

      This is awesome! It’s really a shame that Elixir never did this. All the Elixir libraries are really painful to use in Erlang projects. I intentionally use Erlang for the open source libraries I’ve created so they are usable from all languages built on the VM.

      1. 5

        You feel exactly the same way I do! There’s so much good stuff in Elixir I wish the other languages could use.

        1. 1

          What are some other notable languages on the BEAM runtime? I’m only aware of Elixir, Erlang, and Gleam.

          1. 2

            Elixir and Erlang are the big ones but the other ones that get used in production that I’m aware of are LFE, Purerl, Hamler, and Luerl

            1. 1

              I’ll have to look into those languages. Thanks!

        2. 2

          It’s really a shame that Elixir never did this. All the Elixir libraries are really painful to use in Erlang projects.

          Aside from needing an Elixir compiler in all cases, and not being able to use Elixir macros (not Elixir’s fault!), where is the pain point? I am on the Elixir side 99.9% of the time so I don’t see it.

          1. 6

            There’s a few things:

            • The Elixir compiler and standard library need to be installed, but they’re not on Hex unlike everything else, you don’t get declarative version resolution when installing them, and they increase the bytecode size quite a lot.
            • The Elixir compilation process has a final step that needs to run last in order to optimise protocols, so you’ll need to wrap or modify your build tool in some way to do this.
            • The Elixir module names and some function names use characters that require quoting in atoms, so the function calls look rather strange: 'Elixir.Project.Module':'active?'(X).
            • Many APIs use macros when they could be functional, making them unusable from other languages.

            These are not huge barriers, but if you use a library written in Erlang (or in future Gleam) you don’t have to worry about any of this. You add it to your list of deps and that’s it.

            1. 4

              I think you missed my point. It’s troublesome when you want to use Elixir libraries from Erlang, or from any other language on the Erlang VM. The overhead and hassle of retrofitting an existing project with support for Elixir compilation makes it not worth the effort. Thus I stick with Erlang because I need to write code that can be used from any language on the VM.

              1. 3

                It could be (and probably is) pretty easy to use one of the existing rebar3 plugins for using Elixir from an Erlang project. Take https://github.com/barrel-db/rebar3_elixir_compile as an example integration. I haven’t used it, but it looks pretty straightforward. I still wonder how any configuration would be wired together, but that should be pretty reasonable to write if needed.

                Beyond the build system, I imagine Elixir code doesn’t feel ergonomic in other languages. Its function argument ordering (which differs from a lot of functional languages), the pervasive use of structs in library APIs, the capitalized module names that don’t read well, and the constant 'Elixir.AppModule':some_function() prefix all make it less appealing.

                Altogether, it’s just enough hassle to use outside of Elixir, and often that bit of work plus the usability issues isn’t worth it.

            1. 5

              The post claims this is null-terminated: var s = [_]u8{'h', 'e', 'l', 'l', 'o'};

              Is there a missing '\0' at the end, or is Zig doing something clever?

              1. 4

                That was a mistake; string literals are null-terminated but string slices are not.

              1. 16

                Part of this is ‘computers are fast now’. I distinctly remember two such moments around 10-15 years ago:

                The first was when I was working on a Smalltalk compiler. I had an interpreter that I mostly used for debugging, which was an incredibly naïve AST walker and was a couple of orders of magnitude slower than the compiler. When I was doing some app development work in Smalltalk, I accidentally updated the LLVM .so to an incompatible version and didn’t notice that this meant that the compiler wasn’t running for two weeks - the slow and crappy interpreter was sufficiently fast that it had no impact on the perceived performance of a GUI application.

                The second one was when I was writing my second book and decided to do the ePub generation myself (the company that the publisher outsourced it to for my first book did a really bad job). I wrote in semantic markup that I then implemented in LaTeX macros for the PDF version (eBook and camera-ready print versions). I wrote a parser for the same markup and all of the cross-referencing and so on and XHTML emission logic in idiomatic Objective-C (every text range was a heap-allocated object with a heap-allocated dictionary of properties and I built a DOM-like structure and then manipulated it) with a goal of optimising it later. It took over a minute for pdflatex to compile the book. It took under 250ms for my code to run on the same machine and most of that was the process creation / dynamic linking time. Oh, and that was at -O0.

                The other part is the user friendliness of the programming languages. I’m somewhat mixed on this. I don’t know anything about the COCO2’s dialect of BASIC; the BASIC that I used most at that time was BBC BASIC. This included full support for structured programming, a decent set of graphics primitives (vector drawing and also a teletext mode for rich text applications), an integrated assembler that was enough to write a JIT compiler, and so on. Writing a simple program was much easier than in almost any modern environment and I hit limitations of the hardware long before I hit limitations of the language. I don’t think I can say that about any computer / language that I’ve used since outside of the embedded space.

                1. 6

                  Hm interesting examples. Though I would say Objective C is screamingly fast compared to common languages like Python, JavaScript (even JITted), and Ruby, even if it’s idiomatic to do a lot of heap allocations.

                  Though another “computers are fast” moment I had is that sourcehut is super fast even though it’s written entirely in Python:

                  https://forgeperf.org/

                  It’s basically written like Google from 2005 (which was crazy fast, unlike now). Even though Google from 2005 was written in C++, it doesn’t matter, because the slow parts all happen in the browser (fetching resources, page reflows, JS garbage collection, etc.)

                  https://news.ycombinator.com/item?id=29706150

                  Computers are fast, but “software is a gas; it expands to fill its container…”


                  Another example I had was running Windows XP in a VM on a 2016 Macbook Air. It flies and runs in 128 MB or 256 MB of RAM! And it has a ton of functionality.


                  This also reminds me of bash vs. Oil, because the bash codebase was started in 1987! And they are written in completely different styles.

                  Oil’s Parser is 160x to 200x Faster Than It Was 2 Years Ago

                  Some thoughts:

                  • Python is too slow for sure. It seems obvious now, but I wasn’t entirely sure when I started, since I could tell the bash codebase was very suboptimal (and later I discovered zsh is even slower). The parser in Python is something like 30-50x slower than bash’s parser in C (although it uses a Clang-like “lossless syntax tree”, for better error messages, which bash doesn’t have. It also parses more in a single pass.)
                  • However adding static types, and then naively translating the Python code to C++ actually produces something competitive with bash’s C implementation! (It was slightly faster when I wrote that blog post, is slightly slower now, and I expect it to be faster in the long run, since it’s hilariously unoptimized)

                  I think it is somewhat surprising that if you take some Python code like:

                  for ch in s:
                    if ch == '\n':
                      pass
                  

                  and translate it naively to C++ (from memory, not exactly accurate):

                  for (StrIter it(s); !it.Done(); ) {
                    auto ch = it.Next();  // a heap-allocated one-character string
                    if (str_equals(ch, str1)) {  // implemented with memcmp()
                      ;
                    }
                  }
                  

                  It ends up roughly as fast. The C++ version creates a heap-allocated string for every character, like Python does.

                  Although I am purposely choosing the worst case here. I consciously avoided that “allocation per character” pattern in any place I thought would matter, but it does appear at least a couple of times in the code. (It will be removed, but only in the order that it actually shows up in profiles!)

                  I guess the point is that there are many more allocations. Although I wrote the Cheney garbage collector in part because allocation is just bumping a pointer.
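
                  As a rough illustration of why allocation in a Cheney-style collector is just a pointer bump, here is a minimal Python sketch of a semi-space copying heap. It is not Oil's collector: `Heap`, `Ref`, `alloc`, and `collect` are hypothetical names, objects are modelled as Python lists of fields, and a dictionary stands in for the forwarding pointers a real implementation would store in the old objects.

```python
from collections import namedtuple

# Hypothetical marker for a field that points at another heap object.
Ref = namedtuple("Ref", "addr")


class Heap:
    """Toy semi-space heap: each slot holds one object (a list of fields)."""

    def __init__(self, size):
        self.size = size
        self.space = [None] * size  # the active semi-space
        self.free = 0               # bump pointer: next unused slot

    def alloc(self, fields):
        """Allocate by bumping the pointer; return the new object's address."""
        if self.free >= self.size:
            raise MemoryError("semi-space full; a real runtime would collect here")
        addr = self.free
        self.space[addr] = list(fields)
        self.free += 1
        return addr

    def collect(self, roots):
        """Copy everything reachable from roots into a fresh space (Cheney scan)."""
        from_space = self.space
        self.space = [None] * self.size
        self.free = 0
        forward = {}  # old address -> new address (stand-in for forwarding pointers)

        def copy(addr):
            if addr not in forward:
                forward[addr] = self.free
                self.space[self.free] = list(from_space[addr])
                self.free += 1
            return forward[addr]

        new_roots = [copy(addr) for addr in roots]
        scan = 0
        while scan < self.free:  # the new space doubles as the work queue
            obj = self.space[scan]
            for i, field in enumerate(obj):
                if isinstance(field, Ref):
                    obj[i] = Ref(copy(field.addr))
            scan += 1
        return new_roots


heap = Heap(8)
a = heap.alloc([1])
heap.alloc([99])             # never referenced, so it is garbage
b = heap.alloc([2, Ref(a)])
(root,) = heap.collect([b])  # only b and a survive, compacted to the front
```

                  After the collection, the two live objects occupy the first two slots and the bump pointer sits right after them, so the next allocation is again a single increment.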

                  The garbage collector isn’t hooked up yet, and I suspect it will be slow on >100 MB heaps, but I think the average case for a shell heap size is more like 10 MB.


                  I think the way I would summarize this is:

                  • Some old C code is quite fast and optimized. Surprisingly, Windows XP is an example of this, even though we used to make fun of Microsoft for making bloated code.
                    • bash’s code is probably 10x worse than optimal, because Oil can match it with a higher level language with less control. (e.g. all strings are values, not buffers)
                  • Python can be very fast for sourcehut because web apps are mostly glue and I/O. It’s not fast for Oil’s parser because that problem is more memory intensive, and parsing creates lots of tiny objects (the lossless syntax tree).
                  1. 5

                    Though I would say Objective C is screamingly fast compared to common languages like Python, JavaScript (even JITted), and Ruby, even if it’s idiomatic to do a lot of heap allocations.

                    Yes and no. Objective-C is really two languages, C (or C++ for Objective-C++) and Smalltalk. The C/C++ implementation is as good as gcc or clang’s C/C++ implementation. The Smalltalk part is much worse than a vaguely modern Smalltalk (none of the nice things that a JIT does, such as inline caching or run-time-type-directed specialisation). The code that I wrote was almost entirely in the Smalltalk-like subset. If I’d done it in JavaScript, most of the things that were dynamic message sends in Objective-C would have been direct dispatch, possibly even inlined, in the JIT’d JavaScript code.

                    I used NSNumber objects for line numbers, for example, not a C integer type. OpenStep’s string objects have some fast paths to amortise the cost of dynamic dispatch by accessing a range of characters at once. I didn’t use any of these, and so each character lookup did return a unichar (so a primitive type, unlike your Python / C++ example) but involved multiple message sends to different objects, probably adding up to hundreds of retired instructions.

                    All of these were things I planned on optimising after I did some profiling and found the slow bits. I never needed to.

                    Actually, that’s not quite true. The first time I ran it, I think it used a couple of hundred GiBs of RAM. I found one loop that was generating a lot of short-lived objects on each iteration and stuck an autorelease pool in there, which reduced the peak RSS by over 90%.

                    bash’s code is probably 10x worse than optimal, because Oil can match it with a higher level language with less control. (e.g. all strings are values, not buffers)

                    I suspect that part of this is due to older code optimising for memory usage rather than speed. If bash (or TeX) used a slow algorithm, things take longer. If they used a more memory-intensive algorithm, then the process exhausts memory and is killed. I think bash was originally written for systems with around 4 MiB of RAM, which would have run multiple bash instances and where bash was mostly expected to run in the background while other things ran, so probably had to fit in 64 KiB of RAM, probably 32 KiB. I don’t know how much RAM Oil uses (I don’t see a FreeBSD package for it?), but I doubt that this was a constraint that you cared about. Burning 1 MiB of RAM for a 10x speedup in a shell is an obvious thing to do now but would have made you very unpopular 30 years ago.

                    1. 2

                      Yeah the memory management in all shells is definitely oriented around their line-at-a-time nature, just like the C compilers. I definitely think it’s a good tradeoff to use more RAM and give precise Clang-like error messages with column numbers, which Oil does.

                      Although one of my conjectures is that you can do a lot with optimization at the metalanguage level. If you look at the bash source code, it’s not the kind of code that can be optimized well. It’s very repetitive and there are lots of correctness issues as well (e.g. as pointed out in the AOSA book chapter which I link on my blog).

                      So Oil’s interpreter is very straightforward and unoptimized, but the metalanguage of statically typed Python + ASDL allows some flexibility, like:

                      • interning strings at GC time, or even allocation time (which would make string equality less expensive)
                      • using 4 byte integers instead of 8 byte pointers. This would make a big difference because the data structures are pointer rich. However it tends to “break” debugging so I’m not sure how I feel about it.
                        • Zig does this manually but loses type safety / debuggability because all your Foo* and Bar* just become int.
                      • Optimizing a single hash table data structure rather than the dozens and dozens of linked list traversals that all shells use
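
                      The first two ideas in this list can be sketched together in Python. This is only an illustration under assumptions, not Oil's runtime: `StringPool`, `intern`, and `lookup` are hypothetical names. Each distinct string is stored once and referred to by a small integer handle, so a token stream becomes a compact integer array and string equality becomes an integer comparison.

```python
from array import array


class StringPool:
    """Hypothetical pool that interns strings and hands out integer handles."""

    def __init__(self):
        self._strings = []  # handle -> string
        self._handles = {}  # string -> handle

    def intern(self, s):
        """Return the handle for s, storing s only the first time it is seen."""
        handle = self._handles.get(s)
        if handle is None:
            handle = len(self._strings)
            self._strings.append(s)
            self._handles[s] = handle
        return handle

    def lookup(self, handle):
        """Recover the string behind a handle, e.g. for error messages."""
        return self._strings[handle]


pool = StringPool()

# Store the tokens as fixed-width C ints (typically 4 bytes each)
# instead of one object reference per token.
tokens = array("i", (pool.intern(w) for w in "echo hi ; echo bye".split()))

# Equal strings share one handle, so equality is an integer comparison.
echo = pool.intern("echo")
echo_count = sum(1 for t in tokens if t == echo)
```

                      The debugging cost mentioned above shows up here too: a handle is a plain integer, so nothing stops it from being mixed up with an unrelated index, and it has to go through `lookup` before it can be printed meaningfully.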

                      All of these things are further off than I thought they would be … but I still think it is a good idea to use the “executable spec” strategy, since codebases like bash tend to last 30 years or so, and are in pretty bad shape now. At a recent conference the maintainer emphasized that the possibility of breakage is one reason that it moves relatively slowly and new features are rejected.

                      One conjecture I have about software is:

                      • Every widely used codebase that’s > 100K lines is 10x too slow in some important part, and it’s no longer feasible to optimize
                      • Every widely used codebase that’s > 1M lines is 100x too slow in some important part, …

                      (Although ironically even though bash’s man page says “it’s too big and too slow”, it’s actually small and fast compared to modern software!)

                      I think this could explain your pdflatex observations, although I know nothing about that codebase. Basically I am never surprised that when I write something “from scratch” that it is fast (even in plain Python!), simply because it’s 2K or 5K lines of code tuned to the problem, and existing software has grown all sorts of bells and whistles and deoptimizations!

                      Like just being within 10x of the hardware is damn good for most problems, and you even can do that in Python! (though the shell parser/interpreter was a notable exception to this! This problem is a little more demanding than I thought)

                      1. 4

                        Every widely used codebase that’s > 100K lines is 10x too slow in some important part, and it’s no longer feasible to optimize

                        That’s an interesting idea. I don’t think it’s universally true, but it does highlight the fact that designing to enable large-scale refactoring is probably the most important goal for long-term performance. Unfortunately I don’t think anyone actually knows how to do this. To give a concrete example, LLVM has the notion of function passes. These are transforms that run over a single function at a time. They are useful as an abstraction because they don’t invalidate the analysis results of any other function. At a high level, you might assume that you could then run function passes on all functions in a translation unit at a time. Unfortunately, there are some core elements of the design that make this impossible. The simplest one is that all values, including globals, have a use-def chain and adding (or removing) a use of a global in a function is permitted in a function pass and this would require synchronisation. If you were designing a new IR from scratch then you’d probably try to treat a function or a basic block as an atomic unit and require explicit synchronisation or communication to operate over more than one. LLVM has endured a lot of very invasive refactorings (at the moment, pointers are in the process of losing the pointee type as part of their type, which is a huge change) but the changes required to make it possible to parallelise this aspect of the compiler are too hard. Instead, it’s worked around with things like ThinLTO.

                        I think this could explain your pdflatex observations, although I know nothing about that codebase. Basically I am never surprised that when I write something “from scratch” that it is fast (even in plain Python!), simply because it’s 2K or 5K lines of code tuned to the problem, and existing software has grown all sorts of bells and whistles and deoptimizations!

                        There are two problems with [La]TeX. The first is that it’s really a programming language with some primitives that do layout. A TeX document is a program that is interpreted one character at a time with an interpreter that looks a lot like a Turing machine consuming its tape. Things like LaTeX and TikZ look like more modern programming or markup languages but they’re implemented entirely on top of this Turing-machine layer and so you can’t change that without breaking the entire ecosystem (and a full TeXLive install is a few GiBs of programs written in this language, so you really don’t want to do that).

                        The second is that TeX has amazing backwards compatibility guarantees for the output. You can take a TeX document from 1978 and typeset it with the latest version of TeX and get exactly the same visual output. A lot of the packages that exist have made implicit assumptions based on this guarantee and so even an opt-in change to the layout would break things in unexpected ways.

                        Somewhat related to the first point, TeX has a single-pass output concept baked in. Once output has been shipped to the device, it’s gone. SILE can do some impressive things because it treats the output as mutable until the program finishes executing. For example, in TeX, if you want to do a cross-reference to a page that hasn’t been typeset yet then you need to run TeX twice. The first time will emit the page numbers of all of the labels, the second time will insert them into the cross references. This is somewhat problematic because the first pass will put ‘page ?’ in the output and the second might put ‘page 100’ in the output, causing reflow and pushing the reference to a different place. In some cases this may then cause it to be updated to page 99, which would then cause reflow again. This is made worse by some of the packages that do things like ‘on the next page’ or ‘above’ or ‘on page 42 in section 3’ depending on the context and so can cause a lot of reflowing. In SILE, the code that updates these references can see the change to the layout and if it doesn’t reach a fixed point after a certain number of iterations then it can fall back to using a fixed-width representation of the cross-reference or adding a small amount of padding somewhere to prevent reflows.
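
                        The reflow loop described here can be modelled with a toy Python sketch. It is neither TeX's nor SILE's actual algorithm: the item format, `ref_text`, `layout`, and `resolve` are hypothetical, a page is just a fixed number of characters, and the wording of a reference depends on whether its target is on the same page. The sketch reruns the layout until the page numbers stop changing and, if they never do, reserves a fixed width for every reference.

```python
def ref_text(ref_page, target_page):
    """Render a cross-reference whose wording depends on where the target is."""
    if target_page is None:
        return "??"                      # target unknown on the first pass
    if target_page == ref_page:
        return "on this page"
    return "on page %d" % target_page


def layout(items, page_chars, pages, fixed_width=None):
    """One typesetting pass; returns {label: page} using the previous pass's pages."""
    pos = 0
    result = {}
    for kind, value in items:
        if kind == "text":
            pos += value                 # value is a character count
        elif kind == "label":
            result[value] = pos // page_chars + 1
        elif kind == "ref":
            text = ref_text(pos // page_chars + 1, pages.get(value))
            pos += len(text) if fixed_width is None else fixed_width
    return result


def resolve(items, page_chars, max_passes=5, fixed_width=12):
    """Iterate to a fixed point; fall back to fixed-width references if none is found."""
    pages = {}
    for _ in range(max_passes):
        new_pages = layout(items, page_chars, pages)
        if new_pages == pages:
            return pages, False
        pages = new_pages
    # Widths no longer depend on the page numbers, so this single pass is stable.
    return layout(items, page_chars, pages, fixed_width), True


stable = [("ref", "fig"), ("text", 50), ("label", "fig")]
flipping = [("ref", "fig"), ("text", 90), ("label", "fig")]
stable_pages, stable_fell_back = resolve(stable, page_chars=100)
flipping_pages, flipping_fell_back = resolve(flipping, page_chars=100)
```

                        In the second document the label flips between page 1 and page 2 on every pass, because the longer wording pushes it over the page boundary and the shorter wording pulls it back; reserving a fixed width breaks the cycle.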

                        1. 1

                          … designing to enable large-scale refactoring is probably the most important goal for long-term performance.

                          Yes! In the long run, architecture dominates performance. That is one thesis / goal behind Oil’s unusual implementation strategy – i.e. writing it in high level DSLs which translate to C++.

                          I’ve been able to refactor ~36K lines of code aggressively over 5 years, and keep working productively in it. I think that would have been impossible with 200K-300K lines of code. In my experience, that’s about the size where code takes on a will of its own :-)

                          (Bash is > 140K lines, and Oil implements much of it, and adds a rich language on top, so I think the project could have been 200K-300K lines of C++, if it didn’t fall over before then)

                          Another important thesis is that software architecture dominates language design. If you look at what features get added to say Python or Ruby, it’s often what is easy to implement. The Zen of Python even says this, which I quoted here: http://www.oilshell.org/blog/2021/11/recent-progress.html#how-osh-is-implemented-process-tools-and-techniques

                          When you add up that effect over 20-30 years, it’s profound!


                          The LLVM issues you mention remind me of the talks I watched on MLIR – Lattner listed a bunch of regrets with LLVM that he wants to fix with a new IR. Also I remember him saying a big flaw with Clang is that there is no C++ IR. That is, unlike Swift and the machine learning compiler he worked on at Google, LLVM itself is the Clang IR.

                          Also I do recall watching a video about pass reordering, although I don’t remember the details.


                          Yes to me it is amazing that TeX has survived for so long, AND that it still has those crazy limitations from hardware that no longer exists! Successful software lasts such a long time.

                          TeX and Oil have that in common – they have an unusual “metalanguage”! As I’m sure you know, in TeX it’s WEB and Pascal-H. I linked an informative comment below about that.

                          In Oil it’s statically typed Python, ASDL for algebraic types, and regular languages. It used to be called “OPy”, but I might call this collection of DSLs “Pea2” or something.

                          So now it seems very natural to mention that I’m trying to fund and hire a compiler engineer to speed up the Oil project:

                          https://github.com/oilshell/oil/wiki/Compiler-Engineer-Job (very rough draft)

                          (Your original comment about the dynamic parts of Objective C and their speed is very related!)

                          What I would like a compiler engineer to do is to rewrite a Python front end in Python, which is just 4K lines of code, but might end up at 8K.

                          And then enhance a 3K C++ runtime for garbage collected List<T> and Dict<K, V>. And debug it! I spent most of my time in the debugger.

                          This task is already half done, passing 1131 out of ~1900 spec tests.

                          https://www.oilshell.org/release/0.9.6/pub/metrics.wwz/line-counts/for-translation.html

                          It seems like you have a lot of relevant expertise and probably know many people who could do this! It’s very much engineering, not research, although it seems to fall outside of what most open source contributors are up for.

                          I’m going to publicize this on my blog, but I’m letting people know ahead of time. I know there are many good compiler engineers who don’t read my blog, or who don’t read Hacker News, or who have never written open source (i.e. prefer being paid).

                          (To fund this, I applied for a $50K euro grant which I’ll be notified of by February, and I’m setting up Github sponsors. Progress will also be on the blog.)


                          Someone replied to me with nice info about TeX metalanguages: https://news.ycombinator.com/item?id=16526151

                          Today, major TeX distributions have their own Pascal(WEB)-to-C converters, written specifically for the TeX (and METAFONT) program. For example, TeX Live uses web2c[5], MiKTeX uses its own “C4P”[6], and even the more obscure distributions like KerTeX[7] have their own WEB/Pascal-to-C translators. One interesting project is web2w[8,9], which translates the TeX program from WEB (the Pascal-based literate programming system) to CWEB (the C-based literate programming system).

                          The only exception I’m aware of (that does not translate WEB or Pascal to C) is the TeX-GPC distribution [10,11,12], which makes only the changes needed to get the TeX program running with a modern Pascal compiler (GPC, GNU Pascal).

                    2. 4

                      Windows XP is an example of this, even though we used to make fun of Microsoft for making bloated code.

                      It doesn’t surprise me that we’d feel this way now. From memory (I didn’t like XP enough to have played with it in virtualization at any time since Windows software moved on from supporting it) Windows XP was slow for a few reasons:

                      1. It included slow features that its predecessor didn’t. Like web rendering on the desktop, indexing for search, additional visual effects in critical paths in the GUI, etc.
                      2. It needed a lot more RAM than NT4 or 2000 did. Many orgs had sized their PCs for NT 4 and tried to go straight to XP on the same hardware, and MS had been super conservative about minimum RAM requirements. So systems that met the minimums were miserable.
                      3. (related to 2) It had quite a bit more managed code in the desktop environment, which just chewed RAM.

                      If you tried to install it on a 16MB or 32MB system that seemed just fine with NT SP6 or 2k, you had a bad time. Now, as you point out, we just toss 256MB at it without thinking. Some of the systems in the field when it was released, that MS told us could run XP, could not take 256MB of RAM.

                      1. 2

                        I think you’re mis-remembering the memory requirements of 1990s WinNT a little bit. :-)

                        I deployed NT 3.1 in production. It just about ran in 16MB, and not well. 32MB was realistic.

                        NT 4 was OK in 32MB, decent in 64MB, and the last box I gave to someone had 80MB of RAM and it ran really quite well in that.

                        I deployed an officeful of Win2K boxes in 2000 on Athlons with 128MB of RAM, and six months later, I had to upgrade them all to 256MB to make it usable. (I was canny; I bought 256MB for half of them, and used the leftover RAM to upgrade the others, to minimise how annoyed my client was at needing to upgrade still-new PCs.)

                        XP in 128MB was painful, but it was just about doable in 192MB (the unofficial maxed-out capacity of my Acer-made Thinkpad i1200 series 1163G) and acceptable in 256MB.

                        For an experiment, I ran Windows 2000 (no SPs or anything) on a Thinkpad 701C – the famous Butterfly folding-keyboard machine – in 40MB of RAM. On a 486. It was just marginally usable if you were extremely patient: it booted, slowly, it logged in, very slowly, and it could connect to the Internet, extremely slowly.

                        1. 2

                          I will just believe you… I’m not going to test it :)

                          I remember that I had rooms full of PCs that were OK with either NT4 or 2K, and were pretty much unusable on XP despite vendor promises. The fact that I’ve forgotten the exact amounts of RAM where those lines fell is a blessing. I’m astonished but happy that I’ve finally forgotten… it was such a deeply ingrained thing for so long.

                          1. 2

                            :-D

                            That sounds perfectly fair! ;-)

                            The thing about RAM usage that surprised me in the early noughties was how much XP grew in its lifetime. When it was new, yeah, 256MB and it ran fairly well. Towards the end of its useful lifetime, you basically had to max out a machine to make it run decently – meaning, as it was effectively 32-bit only, 3 and a half (or so) gigs of RAM.

                            One of the things that finally killed XP was that XP64 was a whole different OS (a cut-down version of Windows Server 2003, IIRC) and needed new drivers and so on. So if you wanted good performance, you needed more RAM, and if you needed more than three-and-a-bit gigs of RAM, you had to go to a newer version of Windows to get a proper 64-bit OS.

                            For some brave souls that meant Vista (which, like Windows ME, was actually fairly OK after it received a bunch of updates). But for most, it meant Windows 7.

                            And that in turn is why XP was such a long-lived OS, as indeed was Win7.

                            Parenthetical P.S.: whereas, for comparison, a decent machine for Win7 in 2009 – say a Core i5 with 8GB of RAM – is still a perfectly usable Windows 10 21H2 machine now in 2022. Indeed I bought a couple of Thinkpads of that sort of vintage just a couple of months ago.

                        2. 1

                          Yeah I think all of that is true (although I don’t remember any managed code.) So I guess my point is that the amount of software bloat is just way worse now, so software with small amounts of bloat like XP seems ultra fast.

                          Related thread from a month ago about flatpak on Ubuntu:

                          https://lobste.rs/s/ljsx5r/flatpak_is_not_future#c_upxzcl

                          One commenter pointed out SSDs, which I agree is a big part of it, but I think we’ve had multiple hardware changes that are orders-of-magnitude increases since then (CPU, memory, network). And ALL of it has been chewed up by software. :-(

                          And I don’t think this is an unfair comparison, because Windows XP had networking and a web browser, unlike say comparing to Apple II. It is actually fairly on par with what a Linux desktop provides.

                      2. 4

                        I cut my teeth on AppleSoft BASIC in the 1980s. The only affordance for “structured programming” was GOSUB and the closest thing there was to an integrated assembler was a readily accessible system monitor where you could manually edit memory. The graphics primitives were extremely limited. (You could enable graphics modes, change colors, toggle pixels, and draw lines IIRC. You might have been able to fill regions, too, but I can’t swear to that.) For rich text, you could change foreground and background color. Various beeps were all you could do for sound, unless you wanted to POKE the hardware directly. If you did that you could do white noise and waveforms too. I don’t have enough time on the CoCo to say so with certainty, but I believe it was closer to the Apple experience than what you describe.

                        The thing that I miss about it most, and that I think has been lost to some degree, is that the system booted instantly to a prompt that expected you to program it. You had to do something else to do anything other than program the computer. That said, manually managing line numbers was no picnic. And I’m quite attached to things like visual editing and syntax highlighting these days. And while online help/autocomplete is easier than thumbing through my stack of paper documentation was, I might have learned more, more quickly, from that paper.

                        1. 2

                          Before Applesoft BASIC there was Integer BASIC, which came with the Mini-Assembler. It was very crappy though, and not a compelling alternative to graph paper and a copy of the instruction set. I remember a book on game programming on the Apple II that spent almost half the book writing an assembler in Applesoft BASIC, just to get to the good part!

                          1. 1

                            I remember Integer BASIC only because there were a few systems around our school where you needed to type “FP” to get to Applesoft before your programs would work. I don’t remember the Mini-Assembler at all.

                          2. 1

                            Color BASIC on the CoCo was from Microsoft, and it wasn’t too different from some of the other BASICs of the time, but did require a little adaptation for text-only stuff. Extended Color BASIC (extra cost option early on) added some graphics commands in various graphics modes. With either version of Color BASIC, the only structured programming was via GOSUB. Variable names were limited to one or two letters for floats and for strings.

                            Unfortunately, the CoCo didn’t ship with an assembler/debugger built in; you had to separately buy the EDTASM cartridge (or the later floppy disk version).

                          3. 4

                            My “computers are fast” moment: I was trying to get better image compression. I discovered that an existing algorithm gave better or worse results, seemingly at random, depending on its hyperparameters, so I just tried a bunch of them in a loop to find the best:

                            for(int i=0; i < 100; i++) {
                               result = try_with_parameter(i);
                               if (result > best) best = result;
                            }
                            

                            And it worked great, still under a second. Then I found the original 1982 paper about this algorithm, where they said their university mainframe took 2 minutes per try, on way smaller data. Now I know why they hardcoded the parameters instead of finding the best one.
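A minimal Python sketch of the same brute-force sweep, for illustration only. `try_with_parameter` here is a hypothetical stand-in with a made-up score (the real compression routine is not shown in the comment), and unlike the loop above this version also records which parameter produced the best result.

```python
def try_with_parameter(param: int) -> float:
    """Hypothetical stand-in for one compression run; higher is better."""
    # A made-up score that peaks at param == 37, purely for illustration.
    return -(param - 37) ** 2


def sweep(candidates):
    """Return (best_param, best_score) over all candidates by brute force."""
    best_param, best_score = None, float("-inf")
    for param in candidates:
        score = try_with_parameter(param)
        if score > best_score:
            best_param, best_score = param, score
    return best_param, best_score


best_param, best_score = sweep(range(100))
print(best_param, best_score)
```

With a real scoring function, the cost is simply the number of candidates times the cost of one try, which is why the sweep is trivial now and was impractical at two minutes per try.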

                            1. 12

                              A lot of the new and exciting work in compilers for the last 20 years has been implementing algorithms that were published in the ‘80s but ignored because they were infeasible on available hardware. When LLVM started doing LTO, folks complained that you couldn’t link a release build of Firefox on a 32-bit machine anymore. Now a typical dev machine for something as big as Firefox has at least 32 GiB of RAM and no one cares (though they do care that fat LTO is single-threaded). The entire C separation of preprocessor, compiler, assembler, and linker exists because each one of those could fit independently in RAM on a PDP-11 (and the separate link step originates because Mary Allen Wilkes didn’t have enough core memory on an IBM 704 to fit a Fortran program and all of the library routines that it might use). Being able to fit an entire program in memory in an intermediate representation over which you could do whole-program analysis was unimaginable.

                              TeX has a fantastic dynamic programming algorithm for finding optimal line breaking points in a paragraph. In the paper that presents the algorithm, it explains that it would also be ideal to use the same algorithm for laying out paragraphs on the page but doing this for a large document would require over a megabyte of memory and so is infeasible. SILE does the thing that the TeX authors wished they could do, using the algorithm exactly as they described in the paper.
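To give a feel for the dynamic-programming idea, here is a heavily simplified Python sketch. It is not TeX's algorithm: the cost model (squared unused columns on every line except the last) and the fixed-width character counting are assumptions chosen to keep the example short, and it assumes every word fits within the line width.

```python
def break_lines(words, width):
    """Split words into lines minimising total squared trailing space.

    Simplified cost model (an assumption, not TeX's badness/penalty system):
    each line except the last costs (unused columns) ** 2.
    """
    n = len(words)
    INF = float("inf")
    # best[i] = minimal cost of laying out words[i:]; best[n] = 0 (nothing left).
    best = [INF] * n + [0]
    next_break = [n] * (n + 1)
    for i in range(n - 1, -1, -1):
        length = -1  # cancels the space counted before the first word
        for j in range(i, n):
            length += len(words[j]) + 1
            if length > width:
                break
            # The last line is free; other lines pay for their slack.
            cost = 0 if j == n - 1 else (width - length) ** 2
            if cost + best[j + 1] < best[i]:
                best[i] = cost + best[j + 1]
                next_break[i] = j + 1
    # Walk the recorded break points to rebuild the lines.
    lines, i = [], 0
    while i < n:
        j = next_break[i]
        lines.append(" ".join(words[i:j]))
        i = j
    return lines


print(break_lines("aaa bb cc ddddd".split(), 6))
```

A greedy breaker would put "aaa bb" on the first line and leave "cc" alone on the second; the global optimisation prefers a slightly looser first line in exchange for a much tighter second one.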

                            2. 2

                              RIGHT!? A few years ago, I stumbled upon an old backup CD on which, around 2002 or so, I dumped a bunch of stuff from my older, aging Pentium II’s hard drive. This included a bunch of POV-Ray files that, I think, are from 1999 or so, one of which I distinctly recall taking about two hours to render at a nice resolution (800x600? I don’t think I’d have dared try 1024x768 on that). It was so slow that you could almost see every individual pixel coming up into existence. In a fit of nostalgia I downloaded a more recent version of POV-Ray and after some minor fiddling to get it working with modern POV-Ray versions, I tried to render it at 1024x768. It took a few seconds.

                              I was somewhat into 3D modelling at the time but I didn’t have the computer to match. Complicated scenes required some… creative fiddling. I’d do various parts of the scene in Moray 2 (anyone remember that?) in several separate files, so I could render them separately while working on them. That way it didn’t take forever to do a render. I don’t recall why (bugs in Moray? poor import/copy-paste support when working with multiple files?) but I’d then export all of these to POV-Ray, paste them together by hand, and then do a final render.

                              I don’t know what to think about language friendliness either, and especially programming environment friendliness. I’m too young for 1987 so I can’t speak for BASIC and the first lines of code I ever wrote were in Borland Pascal, I think. But newer environments weren’t all that bad either. My first real computer-related job had me doing things in Flash, which was at version… 4 or 5, I think, back then? Twenty years later, using three languages (HTML, CSS and JS), I think you can do everything you could do in Flash (a recent-ish development, though – CSS seems to have improved tremendously in the last 5-7 years), but with orders of magnitude more effort, and in a development environment that’s significantly worse in just about every way there is. Thank God that dreadful Flash plugin is dead, but still…

                              For a long time I thought this was mostly a “by programmers, for programmers” thing – the march of progress inevitably gave rise to more complex tools, which not everyone could use, so we were generally better off, but non-programmers were not. For example, lots of people at my university lamented the obsolescence of Turbo C – they were electrical engineers who mostly cared about programming insofar as it allowed them to crunch numbers quickly and draw pretty graphics. Modern environments could do a lot more things, but you also paid the price of writing a lot more boilerplate in order to draw a few circles.

                              But after I’ve been at it for a while I’m not at all convinced things are quite that simple. For example, lots of popular application development platforms today don’t have a decent GUI builder, or any GUI builder at all, for that matter, and writing GUI applications feels like an odd mixture of “Holy crap the future is amazing!” and “How does this PDP-11 fit in such a small box!?”. Overall I do suppose we’re better off in most ways but there’s been plenty of churn that can’t be described as “progress” no matter how much you play with that word’s slippery definition.

                              Edit: on the other hand, there’s a fun thread over at the Retrocomputing SO about how NES games were developed. This is a development kit. Debugging involved quite some creativity. Now you can pause an emulator and poke through memory at will. Holy crap is the future awesome!

                              1. 1

                                I’ve been thinking about UI builders, and from my experience, I think they’ve fallen out of favor largely because the result is harder to maintain than a “UI as code” approach.

                                1. 4

                                  They haven’t really kept up with the general shift in development and business culture, that’s true. The “UI description in one file, logic in another file, with boilerplate to bind them” paradigm didn’t make things particularly easy to maintain, but it was also far more tolerable at a time when shifting things around in the UI was considered a pretty bad idea rather than an opportunity for an update (and beefing up some KPIs and so on).

                                  A great deal of usefulness at other development stages has been lost though. At one point we used to be able to literally sit in the same room as the UX folks (most of whom had formal, serious HCI education but that’s a whole other can of worms…) and hash out the user interfaces based on the first draft of a design. I don’t mean new wireframes or basic prototypes, I mean the actual UI. The feedback loop for many UI change proposals was on the order of minutes, and teaching people who weren’t coders how to try them out themselves essentially involved teaching them what flexible layouts are and how to drag’n’drop controls, as opposed to a myriad CSS and JS hacks. For a variety of reasons (including technology) interfaces are cheaper to design and implement today, but the whole process is a lot slower in my experience.

                            1. 6

                              Or, much easier to debug, comprehend, and work with:

                              sqlite> .import path/to/file.csv mytable
                              

                              https://www.sqlite.org/cli.html#importing_csv_files

                              1. 3

                                Back around 2009 I had a service crash because Erlang at the time (and still currently?) had a maximum of 65535 distinct atoms in a given BEAM.

                                That was fun. The error messages were a more formal version of “yeah you really shouldn’t do this.”

                                1. 9

                                  It still has a limit, but that limit is 2**20 aka 1M now. “Don’t create atoms based on user input” is still a best practice :)

                                1. 18
                                  1. 10

                                    This seemed like a wacky proposition before I understood that this is not an OS for servers, it’s an OS for server subcomponents.

                                    From the gut, I like the primitives that Hubris chose, and I pick up more than a whiff of Erlang’s influence. It will be interesting to see where else Hubris spreads in the coming years.

                                    1. 8

                                      I pick up more than a whiff of Erlang’s influence

                                      I think that our industry could do worse than to look at Erlang for ideas. There are probably a lot of problems in micro-service architectures that the Erlang folks figured out 25 years ago.

                                    1. 1

                                      It seems like your goals here are more exploratory than practical. Please correct me if I’m wrong.

                                      On the practical side: in the last year or two, I’d been searching for a more practical way to allow occasional dependency injection for testing purposes. I didn’t like using Mock or Mox much at all. Then a teammate introduced me to Mimic, and mocking no longer bothers me at all. Great tool. https://github.com/edgurgel/mimic

                                      1. 1

                                        Oh, mimic creates an alias to the module with an overloaded name. That kinda gives me the heebie jeebies, can’t quite put my finger on why though.

                                      1. 2

                                        I used it here and there. It was never very good, but it was a neat trick to run Finder on a Unix.

                                        I first used it around 1996, at work. I didn’t have to do much Unixy stuff on it; I get the impression now that it was there because of a military contract. I later used it in 1998-99 for fun in college, and was struck by how much better NetBSD was on the same machines.

                                        Now, I have an SE/30 and a Quadra 840av. A/UX doesn’t run on the latter, and I’d prefer not to waste the former on SVR2 :)

                                        1. 3

                                          Maybe I’m a bit crazy but I feel like if you’re programming elixir with message passing in mind, then you’re doing something wrong. 99% of your code should be purely functional and you should not be thinking about message passing (which is fundamentally stateful/effectful). Sure, it’s great to keep in mind that it’s happening behind the scenes, and that grokking the mechanism can give you confidence about the robustness of your code, but I do not architect the bulk of my code considering message passing, except if I’m digging really deep and writing e.g. a low-level tcp protocol implementation (which I almost never am).

                                          Is it different in the erlang community?

                                          1. 8

                                            Elixir and Erlang do not have a “pure FP” philosophy. Evaluation is eager, typing is dynamic, and side effects are basically never pushed into a monad in practice. Some of the core libraries even proudly expose global/cluster-wide state.

                                            The parts of FP that made it in (first-class functions, immutable data structures, certainly some other aspects I am missing) are there because they are useful for building systems that are resilient, debuggable, repairable, upgradable, and evolvable with ~zero downtime.

                                            That is the primary goal of Erlang, and Elixir inherits this legacy. It’s a goal, not a philosophy, so you may find competing ideas next to each other, because they’ve been shown to work well together in this context.

                                            1. 7

                                              total nitpick, but pure FP requires neither static typing nor laziness.

                                              1. 2

                                                Also effects in FP languages can be, and usually are modeled using process calculi which are exactly what Erlang offers!

                                                That being said, Erlang also has side effects apart from message passing.

                                              2. 2

                                                never claimed it is pure FP. The VM is nice in that it gives you places to breach FP pureness in exactly the right spots and makes it very hard or ugly to breach pureness where doing so is dangerous.

                                                1. 4

                                                  My mistake, I thought you were surprised at message passing from a pure-FP point of view.

                                                  Another reason to think of message passing, and more broadly genservers/processes, in particular is that they can become bottlenecks if used carelessly. People talk about genservers as a building block of concurrency, which isn’t false, but from another point of view they are Erlang’s units of single-threaded computation. They only process one message at a time, and this is a feature if you know how/when to use it, but a drawback at other times. Effective Elixir or Erlang development must keep in mind the overall pattern of message passing largely to avoid this issue (and, in highly performance-sensitive cases, to avoid the overhead of message passing itself).
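As a rough analogy only (Python threads are not BEAM processes, and this is not how GenServer is implemented), the "one message at a time" property can be sketched with a single thread draining a mailbox. The class and function names below are made up for illustration.

```python
import queue
import threading


class CounterServer:
    """Toy single-mailbox server: one thread, one message at a time."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # touched only by the server thread
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while True:
            message, reply = self._mailbox.get()
            if message == "stop":
                reply.put(self._count)
                return
            if message == "increment":
                self._count += 1
            reply.put(self._count)

    def call(self, message):
        """Synchronous request: enqueue, then block until the server replies."""
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((message, reply))
        return reply.get()


def run_clients(server, clients=8, calls_each=100):
    """Hammer the server from several threads; every call still runs serially."""
    def worker():
        for _ in range(calls_each):
            server.call("increment")

    threads = [threading.Thread(target=worker) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.call("stop")


print(run_clients(CounterServer()))
```

The count is always exact without any locks, which is the feature; the flip side is that every caller queues behind the same mailbox, which is the bottleneck described above.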

                                                  1. 1

                                                    Love reminding people that genservers are a unit of single-threadedness.

                                                    1. 1

                                                      I still can’t find a good word for it! https://twitter.com/gamache/status/1390326847662137355

                                              3. 6

                                                I can only offer my own experience (5 years of Erlang programming professionally).

                                                Message passing, like with FP, is a tool that can be (mis)used. Some of the best uses of multiple processes or message passing that I see in code are:

                                                • Enforcing a sequence on unordered events, or enforcing the single writer principle.
                                                • Bounded concurrency. Worker pools, queues, and more.
                                                • Implementing protocols. Protocols for passing messages between processes serve to standardize and abstract. Suppose you have a service on TCP that you want to extend to work over Websockets. The well-architected solution for this has 3 kinds of processes. 1 process that receives Erlang terms, and 2 processes that receive data along some transport (TCP, Websockets, etc.), and send Erlang terms. Structuring Erlang code in this way is an amazing aid in keeping code simple and organized.

                                                I’ll generally come across problems that are solved by processes/message passing when writing libraries. When writing application code that uses those libraries, it’s usually far less common.

                                                1. 4

                                                  my advice is typically:

                                                  • are you writing a library? You probably don’t need a genserver (except for odd things like “I need a ‘fire and forget’ genserver to wrap an ets table”; well, yeah).
                                                  • ok so you still think you need a genserver? did you try Task? (this is elixir-specific)
                                                  • are you wrapping a stateful communication protocol? then go ahead and use genserver.
                                                  • are you creating a ‘smart cache’ for something IRL or external to the vm? then go ahead and use genserver
                                                  • are you creating temporary shared state between users (like a chat room)? then go ahead and use genserver

                                                  I like the bounded concurrency one. Should probably add it to my list. Are you creating a rate limiter or resource pool? then use genserver.

                                                  1. 3

                                                    There is nothing wrong with using gen_server in library. The thing is that in most cases it is not you who should start and manage that process - leave it up to the user. The “problem” is that there are 3 “kinds” of projects in Erlang world:

                                                    • “libraries” which should not start their own supervisor in general and should leave all process management up to the end user
                                                    • “library applications” which should be mostly self contained, and are mostly independent from the main application, for example OpenTelemetry, systemd integration, Telemetry Poller, etc.
                                                    • “end-user applications” your application, where you handle all the data and processes on your own

                                                    In each of these parts there are different needs and message passing will be used more or less, depending on the needs.

                                                    1. 1

                                                      150% this. Dave Thomas got this trichotomy right in his empex talk, it’s just infuriating to me that his choice of “names” for these things is unnecessarily confusing.

                                                      1. 1

                                                        Sorry I mistyped… It should be “are you writing a library? If not, you should probably not be writing a genserver.”

                                                  2. 3

                                                    In my experience (having done Elixir for 8 years), folks that don’t understand/think about messages and genservers in Elixir are at a severe disadvantage when debugging and grokking why their code is behaving some way.

                                                    It’s a lot like people who in the previous generation of webdevs learned Rails but not Ruby…which is fitting, since those are the folks driving adoption of Phoenix.

                                                    (There are also certainly people who reach for a genserver for everything, and that’s a whole additional annoying thing. Also the people who use Tasks when they really, really shouldn’t.)

                                                    1. 2

                                                      Processes are the core feature that provides both error isolation and fault tolerance. As you build more robust systems in Elixir, the number of times you use processes increases.

                                                      Often it is correct to abstract this away behind a module or API, but you’re still passing messages.

                                                      1. 1

                                                        I’ve not used Elixir, but it sounds from your description as if it has some abstractions that give you Erlang (technically Beam, I guess) processes but abstract away the message-passing details? The last time I wrote a non-trivial amount of Erlang code was about 15 years ago, but at least back then Erlang didn’t have anything like this. If you wanted concurrency, you used processes and message passing. If you didn’t want concurrency, there were several functional programming languages with much faster sequential execution than Erlang, so you’d probably chosen the wrong tool.

                                                      1. 14

                                                        All of the negative observations about the CSV format in this article are true. What is not mentioned is that Excel, Google Sheets, and the rest of the serious spreadsheet applications work around these problems as a matter of course, so no one cares, nor should they have to. Emit reasonably-formatted fields (ISO 8601 for datetime, etc), quote everything properly, and it will be perfectly usable on the other end if it belongs in a spreadsheet at all.
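A small Python sketch of that advice, using the standard `csv` module: quote every field and emit datetimes as ISO 8601. The helper name and sample rows are made up for illustration.

```python
import csv
import io
from datetime import datetime, timezone


def rows_to_csv(rows):
    """Serialise rows to CSV text, quoting every field."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\r\n")
    for row in rows:
        # Datetimes become ISO 8601 strings; everything else is written as-is.
        writer.writerow(
            [v.isoformat() if isinstance(v, datetime) else v for v in row]
        )
    return buffer.getvalue()


sample = [
    ["name", "note", "created_at"],
    ["Ada", 'said "hi", then left', datetime(2022, 1, 2, 3, 4, 5, tzinfo=timezone.utc)],
]
print(rows_to_csv(sample), end="")
```

Embedded quotes are doubled and embedded commas stay inside the quoted field, so the row survives a round trip through any reader that follows the same quoting rules.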

                                                        1. 4

                                                          Yet here I am, working with the local equivalent of Fortune 500 companies that send me ; separated ‘CSV’s that use the , as the decimal separator.
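For what it's worth, a file like that can still be read with Python's standard `csv` module if you know the conventions up front. This is a sketch under assumptions: the column names and sample data are invented, and it assumes numbers carry no thousands separators.

```python
import csv
import io
from decimal import Decimal


def parse_semicolon_csv(text, numeric_columns):
    """Parse ';'-separated text whose numbers use ',' as the decimal mark."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    rows = []
    for row in reader:
        for column in numeric_columns:
            # Assumes no thousands separators: "1234,50" rather than "1.234,50".
            row[column] = Decimal(row[column].replace(",", "."))
        rows.append(row)
    return rows


sample = "item;price\nwidget;12,50\ngadget;0,99\n"
print(parse_semicolon_csv(sample, ["price"]))
```

The catch is that nothing in the file itself says which delimiter or decimal mark is in use, so the reader has to be told out of band.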

                                                          1. 2

                                                            Emit reasonably-formatted fields (ISO 8601 for datetime, etc), quote everything properly, and it will be perfectly usable

                                                            But people don’t do those things. Some people do, but certainly not all of them. That’s the entire point.

                                                          1. 4

                                                            Here’s a little context on Delta and the linked blog post:

                                                            Delta is a format to describe documents’ contents and how they change over time. This is a core piece of technology at Slab that powers our real-time collaboration engine, thanks to the built-in support for Operational Transform (think multiple users working together in Google docs).

                                                            Though we’ve been using it internally for almost 4 years now, we’re finally open-sourcing it to the wider Elixir community.

                                                            Feel free to reach out if you have any questions or feedback!

                                                            1. 1

                                                              What kind of networking does the OT use underneath? e.g. are you using some form of epidemic broadcast?

                                                              1. 1

                                                                This library just implements the core OT format and algorithms, and not any networking component.

                                                                Though at Slab, we use this with Phoenix Channels on top of a custom GenServer & Supervisor and it has worked out great for us.

                                                              2. 1

                                                                Do Google docs still use OTs? I thought they moved away from them about a decade ago because they hit the problem everyone hits just after they’ve invested a lot in OTs: that they hit combinatorial problems when you have a lot of operations and they all need to compose.

                                                                1. 2

                                                                  AFAIK Google Docs still uses OT, though they have probably moved away from the extremely rich data model supported by Wave early on. A visible artifact of this is that for each doc, there is a server coordinating updates, and with a large number of clients it can get overloaded (“Editing is temporarily disabled”). OT requires a centralized server to handle the propagation of updates.

                                                                  CRDTs, conversely, allow a flatter topology in which the server is more like just another client. As you can imagine, having a centralized authority simplifies things, so CRDTs are generally more complex than OTs with equivalent operations, and were for a long time considered infeasible for real-time editing and similar use cases.

                                                                  1. 1

                                                                    Yes AFAWK they still use OT and never moved for realtime text editing.

                                                                    To implement transform for OT you do need to consider all interactions between the different operations, so the complexity is n^2 where n is the number of operations (although many combinations may be similar). Google Wave ran into this issue, and it was a common criticism even amongst their own engineering team. That’s why our design only includes 3 operations: insert, delete, and retain. A tradeoff example is: there is no replace operation. To effectively implement replace you would use insert + delete. There are subtle differences between a replace and an insert+delete but we did not feel they were worth the complexity it would create.
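To make the three-operation model concrete, here is a toy Python sketch that applies a list of operations to a plain string. The tuple encoding and the `apply_delta` name are assumptions for illustration and are not the library's actual data format or API; the sketch only applies operations and does not implement transform. It also treats any text after the last operation as implicitly retained, which is a simplifying assumption.

```python
def apply_delta(text, ops):
    """Apply a list of ("retain", n) / ("delete", n) / ("insert", s) ops to text."""
    out = []
    pos = 0  # read position in the original text
    for kind, arg in ops:
        if kind == "retain":
            out.append(text[pos:pos + arg])
            pos += arg
        elif kind == "delete":
            pos += arg  # skip over the deleted characters
        elif kind == "insert":
            out.append(arg)
        else:
            raise ValueError(f"unknown operation: {kind!r}")
    out.append(text[pos:])  # anything past the last op is kept unchanged
    return "".join(out)


# "Replace" expressed as insert + delete: swap "World" for "Slab".
replace_world = [("retain", 6), ("insert", "Slab"), ("delete", 5)]
print(apply_delta("Hello World!", replace_world))
```

With only three operation kinds, a transform function has to handle a small, fixed set of pairings, which is the tradeoff described above.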

                                                                1. 3

                                                                  I find the COBOL code on the site much more readable than the mess of brackets that is Elixir.

                                                                  1. 2

                                                                    Do you mean the 4 curly braces in 20 lines of code? The translated code looks fine to me. Would have a lot more braces if it had C based syntax.

                                                                    1. 2

                                                                      In general, the thing I like most with COBOL is the (relative) lack of special characters. Sure, it could be worse.

                                                                    2. 2

                                                                      I did have the same thought. I had never actually seen COBOL before this article. It has a very SQL look to it, which is nice and clean. The contrast between the clean COBOL and the typical bracket-y language syntax is very stark.

                                                                      1. 2

                                                                        Readability was never one of the common criticisms of COBOL. Quite the reverse: it’s mainly criticised for being verbose. There was a joke that there would be an OO extension to COBOL, just as C++ was to C, called ADD ONE TO COBOL RETURNING COBOL. COBOL, by design, doesn’t have a lot of the syntactic sugar that other languages provide that make things more terse and so is very easy to read but more effort to write (though probably less so with a modern editor that can do a lot of autocompletion).

                                                                      2. 1

                                                                        Elixir accepts pull requests; perhaps you should work up a patch to banish tuple and map syntax to instead live under DATA DIVISION.

                                                                        1. 1

                                                                          I doubt they’d accept it.

                                                                      1. 3

                                                                        Python is really a shit language for obfuscating things. In the Perl community, obfuscated scripts that print “Just another Perl hacker” have been a tradition for decades, and the richness of the language really pays off when you’re trying to be a jerk.

                                                                        Fresh from 2009 or so, here’s my best take on the genre – this produces a rendering of my email sig at the time. I’m particularly proud of the base 26 conversion :)

                                                                        $q=q((pete gamache));$_=fpeoiaglclivdlcgglvqhcbwclemhcbwflclvqivcrgqhcbwafak
                                                                        ;sub A{for(($a=1)..pop){$a*=$a*=$a+$a*$a;$a%=999999}$a%128}sub b{@A=a..pop}$
                                                                        qq.=chr A(uc b($1))while s/(..)//;$qq="($qq)";&z;print"$q $qq 14 /Courier ";
                                                                        sub c{$x=$_[0];$x-=6.28until$x<6.28;my$a;for(0..10){$a+=($x**(2*$_)*(-1)**$_
                                                                        )/_(2*$_)}$a*2**(-$_[0]/8)}sub _{$_[0]<1?1:$_[0]*_($_[0]-1)}print join (' ',
                                                                        '66 666 translate 510 0 moveto 0 dup dup lineto 50 moveto',map{$_,c($_/8)*50
                                                                        ,'lineto'}(0..444)),' stroke findfont exch scalefont setfont 222 -30 moveto'
                                                                        ,' show 222 20 moveto show showpage';sub z{$q=~s/(g.+E)/$1, $1\@gmail.com/i}
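
The base 26 step appears to come from `sub b`, which counts the elements of the string range `'a'..$1`. Counting a Perl range that starts at `a` amounts to reading the end string as a bijective base-26 number (a=1 ... z=26, aa=27). A minimal Python sketch of that arithmetic only, not a port of the script; `letters_to_int` is a made-up name:

```python
def letters_to_int(s: str) -> int:
    """Bijective base-26 value of a lowercase string: a=1 ... z=26, aa=27."""
    n = 0
    for ch in s:
        if not "a" <= ch <= "z":
            raise ValueError(f"expected lowercase a-z, got {ch!r}")
        # Shift the running value one "digit" left, then add this letter (1..26).
        n = n * 26 + (ord(ch) - ord("a") + 1)
    return n


# The script consumes its payload two letters at a time, starting with "fp".
print(letters_to_int("fp"))  # 6 * 26 + 16 = 172
```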
                                                                        
                                                                        1. 3

                                                                          This has been my .signature for ages:

                                                                          #!/usr/bin/perl                             https://domm.plix.at
                                                                          for(ref bless{},just'another'perl'hacker){s-:+-$"-g&&print$_.$/}
                                                                          
                                                                          1. 3
                                                                          1. 8

                                                                            I kind of want a new tag for urbit just so I don’t need to see the political fight every single time it comes up.

                                                                            1. 15

                                                                              It’d be cool if we had a community norm to avoid politics when discussing tech, but there is a sizable contingent of Lobsters here who cannot function that way and prioritize their desire to talk about politics over talking about the tech.

                                                                              1. 12

                                                                                Lots of projects have creators with shitty opinions, but few of those projects represent and implement those opinions. Urbit is tech that implements a political view, and to consider the tech separately from the politics is as inane as considering the politics separately from the tech.

                                                                                1. 12

                                                                                  The politics of Urbit have been discussed many times over. There’s literally nothing new to talk about there. The tech, however, is ever evolving. Why not talk about that instead? Surely technical perspectives should be prioritized, at least here on Lobsters.

                                                                                  1. 12

                                                                                    The tech is “ever-evolving” I guess, but this article is a survey of the Urbit landscape, not a changelog.

                                                                                    Anyway: go ahead, talk about the tech. But in this project’s case, the tech and the politics are deliberately tangled, so don’t be surprised or disappointed when the politics of the project come up for discussion too.

                                                                                  2. 11

                                                                                    An interesting and enlightening discussion around Urbit’s politics would talk about what those politics are, how they have influenced the tech behind Urbit, and finally to what degree Urbit represents a successful realization of those politics. This is similar to how we evaluate deeply political technology projects like Project Cybersyn.

                                                                                    This is emphatically not the style of discussion we get here. Instead, people do the shallow (and perhaps even lazy) thing and make uninteresting (and often unsubstantiated and incoherent) moral claims, and then you just get a bunch of Lobsters clacking angrily at each other while high on self-righteousness. It’s all so tiresome.

                                                                                    1. 5

                                                                                      Such a discussion would be granting the project a legitimacy that it hasn’t earned and doesn’t deserve.

                                                                                      1. 3

                                                                                        When was the last time Lobsters talked about Cybersyn? When I searched, I saw a handful of posts with a handful of comments. That’s not really a discussion. So I don’t know where this deep evaluation is happening.

                                                                                        1. 5

                                                                                          I found this in my DB of all submissions to lobste.rs, it got 6 upvotes and 1 comment:

                                                                                          https://www.jacobinmag.com/2015/04/allende-chile-beer-medina-cybersyn/

                                                                                          Interestingly it’s one of six submissions from Jacobin, the last one was in Nov 2018.

                                                                                          Including this one, 23 submissions have “urbit” in their title.

                                                                                          1. 3

                                                                                            So I don’t know where this deep evaluation is happening.

                                                                                            That would be my point–it ain’t happening here, because the vast majority of lobsters (quite probably myself included) are incapable of objective, dispassionate policy discussion.

                                                                                            1. 9

                                                                                              because the vast majority of lobsters (quite probably myself included) are incapable of objective, dispassionate policy discussion.

                                                                                              “Objective, dispassionate policy discussion” isn’t some Platonic ideal that’s fit for all subject matter. When the topic is an elaborate prank schemed up by some Neoreactionary narcissist, the appropriate response is manifestly not a contemplative survey of its politics or positions.

                                                                                              1. 5

                                                                                                I’m also reasonably sure political discussion is somewhat off-topic here. Further, if a technical topic is sufficiently politicised I think the technical discussion itself becomes off-topic since no one can reliably talk about it in a neutral way that doesn’t in some ways refer back to the political context of the work.

                                                                                            2. 2

                                                                                              I don’t disagree that some (lots) of the political discussion that ensues after an Urbit posting is shallow, lazy, tiresome, content-free drivel. But the reason it doesn’t belong here isn’t because politics don’t belong in Lobste.rs discussions.

                                                                                              1. 3

                                                                                                The politics of urbit itself match your descriptors. Pointing that out as an FYI may be a simple observation, but that doesn’t make it unimportant.

                                                                                                People can still ignore those threads and discuss Urbit’s whacky-ass tech, as the current top post does. Nobody is preventing that discussion from occurring.

                                                                                            3. 3

                                                                                              I never understood the “He’s right wing so these are the choices he made” argument. (oversimplified? Yes. But I’m OK with that.) The first time I read it, sure, interesting, engaging. The second time, I tried again, and again it seemed there was a massive cultural gap between what people thought and what I could understand. Third time, similar… and then I stopped caring. Urbit seemed to not be going anywhere, and caring about a random person’s political opinions because he happened to be a tech author seemed like a waste of time.

                                                                                              An 18 second sketch that illustrates what this all seems like to me: https://www.youtube.com/watch?v=79GNnfDrgWM

                                                                                            4. 5

                                                                                              You’re active on all political discussion I’ve seen here in recent memory, including sharing your own political opinions. I chime in on this stuff too because it interests me. Why pretend to be above it? Clearly people want to discuss the tech implications of politically-adjacent stuff like this. It’s not virtuous to remove extra-technical context from the forum, it just makes the discussion less informed.

                                                                                              1. 1

                                                                                                I don’t understand your point. People want to flame about politics, not talk about the tech related aspects, or I’d have fewer problems with the chatter.

                                                                                                1. 3

                                                                                                  I’m not understanding how your pushback here relates to my point. But to respond - there’s plenty of tech discussion here by the people who want to be having it. I don’t see the political threads preventing people from engaging. The top two threads are about technical details. I don’t suppose that anyone extra would have jumped into the techy bits had nobody commented on the extra-techy bits. The site is designed to accommodate multiple threads of discussion. So the whole ‘politics distracts from the technical discussion’ is not something I’m buying in this case.

                                                                                          1. 6

                                                                                            Zig article -> 44 upvotes, Scala 3 -> 14 upvotes.

                                                                                            I am not sure what Scala is used for anymore. F# / Rust / Kotlin and probably OCaml ate away much of the user base over time. If you are an ex Scala coder, what language did you switch to?

                                                                                            1. 11

                                                                                              There’s 18 Apache projects written in Scala

                                                                                              • Apache CarbonData
                                                                                              • Apache Clerezza
                                                                                              • Apache Crunch (in the Attic)
                                                                                              • Apache cTAKES
                                                                                              • Apache Daffodil
                                                                                              • Apache ESME (in the Attic)
                                                                                              • Apache Flink
                                                                                              • Apache Hudi
                                                                                              • Apache Kafka
                                                                                              • Apache Polygene (in the Attic)
                                                                                              • Apache PredictionIO (in the Attic)
                                                                                              • Apache Samza
                                                                                              • Apache ServiceMix
                                                                                              • Apache Spark
                                                                                              • Apache Zeppelin

                                                                                              There’s some big names there (Spark, Kafka, Flink, Samza), especially in data movement. Also Netflix has atlas (time-series DB), Microsoft has hyperspace. Seems like most Scala code is associated with Spark in one way or another.

                                                                                              1. 2

                                                                                                Huh, I thought Kafka, Flink and Samza were written in Java. Shows what I know. Neat link!

                                                                                                1. 2

                                                                                                  This overstates the case for Scala a bit. Check the founding years of these projects. Kafka and Spark, which are two of the most popular projects in this list, were created in 2011 and 2014, at the height of Scala popularity. Both projects were written in Scala, but had to put significant effort into engineering a first-class pure Java API. The Kafka team even rewrote the clients in Java eventually. GitHub repo analysis has the Kafka codebase as 70% Java and 23% Scala.

                                                                                                  It’s true that Spark does use Scala a bit more. GitHub there has Scala as 70% of codebase, with Python 13% and Java 8%. But Spark might just be the “perfect” use case for a language like Scala, being as focused as it is on concurrency, immutability, parallel computing, higher-order programming, etc.

                                                                                                  I also closely tracked development of Apache Storm (created 2011), and it started as a Clojure project, but was eventually rewritten (from scratch) in Java. There are lots of issues with infrastructure software not sticking with vanilla Java (or other “systems” languages like C, Go, etc.). Apache Cassandra and Elasticsearch stuck with pure Java, and had fewer such issues. Durability, simplicity, and ecosystem matter more than programming language features.

                                                                                              2. 8

                                                                                                It’s still pretty big in data engineering. Apache Spark was written in Scala.

                                                                                                1. 6

                                                                                                  The company I work for uses Scala for data engineering. I don’t think that team has any plans to move away from it. I suspect that the use of Scala is a product of the time: the company is about ten years old; Scala was chosen very early on. It was popular back then.

                                                                                                2. 7

                                                                                                  Elixir and loving it for almost 6 years now. I miss a couple of Scala’s features; in particular, implicits were nice for things like DI, execution contexts, etc. I don’t miss all the syntax and I certainly don’t miss all the Haskell that creeps in in a team setting.

                                                                                                  1. 6

                                                                                                    If you are an ex Scala coder, what language did you switch to?

                                                                                                    Python, now Go. At one point I figured I could ship entire features while waiting for my mid-sized Scala project to compile.

                                                                                                    I hope they addressed that problem.

                                                                                                    1. 3

                                                                                                      YAML in mid 2015 (I took a detour in infrastructure) but Clojure since 2019 now that I’m a dev again.

                                                                                                      FWIW I liked Scala as a “better Java” when I started using it around mid 2012, until early 2015 when I left that gig.

                                                                                                      I remember that I found it very difficult to navigate Scala’s matrix-style documentation; and that I hated implicits, and the operator precedence. I loved case classes, and I think I liked object classes (not sure that’s the right terminology). And I liked that vals were immutable.

                                                                                                      Compile times didn’t bother me that much, perhaps because I worked on early-stage greenfield projects with one or two other devs. (So didn’t have lots of code.)

                                                                                                      I liked programming with actors (we used Akka) but we found it difficult to monitor our services. Some devs were concerned about loss of type safety when using Akka actors.

                                                                                                      1. 2

                                                                                                        Akka actors have been type-safe for several years now.

                                                                                                        1. 2

                                                                                                          Off topic: there’s a lot of conversation/blog posts to be had about devs finding themselves in infrastructure. I’m there at the moment, and it’s a very different world.

                                                                                                        2. 3

                                                                                                          I’m using Scala for my hobby projects. It does all I need. I like multiple argument lists and the ability to convert the last argument to a block. Implicits are cool, though they need to be controlled, because sometimes they’re confusing; for example, I don’t really understand how the uPickle library works inside, even though I know how to use it. Compilation times are not as bad as some people say; maybe they were bad in some early Scala versions, but they’re not as bad as e.g. C++. It works with Java libraries (though sometimes it’s awkward to use Java-style libs in Scala, but it’s the same story as using C from C++ – it’s a matter of creating some wrappers here and there).

                                                                                                          1. 3

                                                                                                            I wrote several big scala projects a decade ago (naggati, scrooge, kestrel) but stopped pretty much as soon as I stopped being paid to. Now I always reach for typescript, rust, or python for a new project. Of the three, rust seems the most obviously directly influenced by (the good parts of) scala, and would probably be the most natural migration.

                                                                                                            Others covered some of the biggest pain points, like incredibly long compile times and odd syntax. I’ll add:

                                                                                                            Java interoperability hurt. They couldn’t get rid of null, so you often needed to watch out and check for null, even if you were using Option. Same for all other java warts, like boxes and type erasure.

                                                                                                            They never said “no” to features. Even by 2011 the language was far too big to keep in your head, and different coders would write using different subsets, so it was often hard to understand other people’s code. (C++ has a similar problem.) Operator overloading, combined with the ability to make up operators from nothing (like <+-+>) meant that some libraries would encourage code that was literally unreadable. Implicits were useful to an extent, but required constant vigilance or you would be squinting at code at 3am going “where the heck did this function get imported from?”

                                                                                                            1. 2

                                                                                                              2 days later, Scala 3 -> 44 upvotes ;)

                                                                                                              1. 1

                                                                                                                I’m currently in Scala hiatus after nearly eight years as my primary stack with a splash of Rust, Ruby, Groovy, and a whole lot of shell scripting across several products at three companies. This included an IBM product that peaked near $100M/yr in sales. The major component I managed was JVM-only and 3/4 of it was Scala.

                                                                                                                For the last few months, I’ve been working in Python, doing some PySpark and some Tensorflow and PyTorch computer vision stuff. While I concede that the Python ecosystem has certainly matured in the 15 years since I did anything material with it, my preference would be to rewrite everything I’m writing presently in Scala if the key libraries were available and their (re)implementations were mature.

                                                                                                              1. 5

                                                                                                                My Elixir deployment is mostly on ARM servers these days, so I have to wait a little bit before I can fully enjoy the JIT. But I am enjoying the ArgumentError improvements already – what a difference! Congrats to the team.

                                                                                                                1. 8

                                                                                                                  Congrats to the team! I love seeing the steady progress on the project and seeing these ideas explored on the BEAM. In particular, making all assignments expressions instead of statements is going to be a big quality-of-life improvement.

                                                                                                                  1. 6

                                                                                                                    I like lisp but macros should be a last resort thing. Is it really needed in those cases, I wonder.

                                                                                                                    1. 18

                                                                                                                      I disagree. Macros, if anything, are easier to reason about than functions, because in the vast majority of cases their expansions are deterministic, and in every situation they can be expanded and inspected at compile-time, before any code has run. The vast majority of bugs that I’ve made have been in normal application logic, not my macros - it’s much more difficult to reason about things whose interesting behavior is at run-time than at compile-time.

                                                                                                                      Moreover, most macros are limited to simple tree structure processing, which is far more constrained than all of the things you can get up to in your application code.

                                                                                                                      Can you make difficult-to-understand code with macros? Absolutely. However, the vast majority of Common Lisp code that I see is written by programmers disciplined enough to not do that - when you write good macros, they make code more readable.

                                                                                                                      1. 3

                                                                                                                        “Macros, if anything, are easier to reason about than functions, because in the vast majority of cases their expansions are deterministic, and in every situation they can be expanded and inspected at compile-time, before any code has run. The vast majority of bugs that I’ve made have been in normal application logic”

                                                                                                                        What you’ve just argued for are deterministic, simple functions whose behavior is understandable at compile time. They have the benefits you describe. Such code is common in real-time and safety/security-critical coding. An extra benefit is that static analysis, automated testing, and so on can easily flush bugs out in it. Tools that help optimize performance might also benefit from such code just due to easier analysis.

                                                                                                                        From there, there’s macros. The drawback of macros is they might not be understood instantly the way a programmer understands common language constructs. If done right (esp names/docs), then this won’t be a problem. The next problem, which the author already notes, is that tooling breaks down on them. Although I didn’t prove it out, I hypothesized this process to make them reliable:

                                                                                                                        1. Write the code that the macros would output first on a few variations of inputs. Simple, deterministic functions operating on data. Make sure it has pre/post conditions and invariants. Make sure these pass above QA methods.

                                                                                                                        2. Write the same code operating on code (or trees or whatever) in an environment that allows similar compile-time QA. Port pre/post conditions and invariants to code form. Make sure that passes QA.

                                                                                                                        3. Make final macro that’s a mapping 1-to-1 of that to target language. This step can be eliminated where target language already has excellent QA tooling and macro support. Idk if any do, though.

                                                                                                                        4. Optionally, if the environment supports it, use an optimizing compiler on the macros integrated with the development environment so the code transformations run super-fast during development iterations. This was speculation on my part. I don’t know if any environment implements something like this. This could also be a preprocessing step.

                                                                                                                        The resulting macros using 1-3 should be more reliable than most functions people would’ve used in their place.
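
Steps 1 and 2 above can be sketched in Python, with the standard `ast` module standing in for a macro system. This is only an illustration under assumptions: `square_checked`, `make_checked_unary`, and `expand` are hypothetical names, and the QA here is a plain equivalence check on a few inputs, not the static analysis or automated testing described above.

```python
import ast


# Step 1: write by hand the code the macro should output, with
# pre/post conditions, and run it through the usual QA.
def square_checked(x):
    assert isinstance(x, int)   # precondition
    result = x * x
    assert result >= 0          # postcondition
    return result


# Step 2: the same code, produced by a function that operates on code (a tree).
def make_checked_unary(name: str, expr_src: str, post_src: str) -> ast.Module:
    source = (
        f"def {name}(x):\n"
        f"    assert isinstance(x, int)\n"
        f"    result = {expr_src}\n"
        f"    assert {post_src}\n"
        f"    return result\n"
    )
    tree = ast.parse(source)
    # Invariants on the generated code itself: one function, one argument.
    assert len(tree.body) == 1 and isinstance(tree.body[0], ast.FunctionDef)
    assert [a.arg for a in tree.body[0].args.args] == ["x"]
    return tree


def expand(tree: ast.Module) -> dict:
    """Compile the generated tree and return the namespace it defines."""
    namespace = {}
    exec(compile(tree, "<generated>", "exec"), namespace)
    return namespace


tree = make_checked_unary("square_checked", "x * x", "result >= 0")
generated = expand(tree)["square_checked"]

# QA: the generated function must agree with the hand-written one.
assert all(generated(n) == square_checked(n) for n in range(-5, 6))
```

Because the expansion is an ordinary tree, `ast.dump(tree)` can be inspected before any generated code runs, which is the property the parent comment attributes to macros.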

                                                                                                                        1. 2

                                                                                                                          What you’ve just argued for are deterministic, simple functions whose behavior is understandable at compile time.

                                                                                                                          In a very local sense, I agree with you - a simple function is easier to understand than a complex function.

                                                                                                                          However, that’s not a very interesting property.

                                                                                                                          A more interesting question/property is “Is a large, complex system made out of small, simpler functions easier to manipulate than one made from larger, more complex functions?”

                                                                                                                          My experience has been that, when I create lots of small, simple functions, the overall accidental complexity of the system increases. Ignoring that accidental complexity for the time being, all problems have some essential complexity to them. If you make smaller, simpler functions, you end up having to make more of them to implement your design in all of its essential complexity - which, in my experience, ends up adding far more accidental complexity due to indirection and abstraction than a smaller number of larger functions.

                                                                                                                          That aside, I think that your process for making macros more reliable is interesting - is it meant to make them more reliable for humans or to integrate tools with them better?

                                                                                                                          1. 1

                                                                                                                            “A more interesting question/property is “Is a large, complex system made out of small, simpler functions easier to manipulate than one made from larger, more complex functions?”

                                                                                                                            I think the question might be: what is simple and what is complex? Another is: simple for humans or for machines? I liked the kinds of abstractions and generative techniques that let a human understand something while producing output that was easy for a machine to work with. In general, I think the two goals often conflict.

                                                                                                                            That leads to your next point, where increasing the number of simple functions actually made the system more complex for you. That happened in formally verified systems, too, where simplifications for proof assistants made things ugly for humans. I guess it should be as simple as it can be without causing extra problems. I have no precise measurement of that. Plus, I’d like to see more R&D invested in generative techniques that connect high-level, human-readable representations to machine-analyzable ones. Quick examples to make it clear might be Python’s vs C’s looping, a parallel for in a non-parallel language, or per-module choices for memory management (e.g., GCs).

                                                                                                                            “is it meant to make them more reliable for humans or to integrate tools with them better?”

                                                                                                                            Just reliable in general: they do precisely what they’re specified to do. From there, humans or tools could use them. Humans will use them as they did before except with precise, behavioral information on them at the interface. Looking at contracts, tools already exist to generate tests or proof conditions from them.
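
As a rough sketch of what generating tests from a contract can look like: Python has no Lisp-style macros, so a plain function stands in for the macro here, and `contract`, `generate_tests`, `largest`, and `random_list` are all hypothetical names made up for illustration, not any existing tool's API.

```python
import random


def contract(pre, post):
    """Attach a precondition and a postcondition to a function (hypothetical helper)."""
    def wrap(fn):
        def checked(*args):
            assert pre(*args), "precondition violated"
            result = fn(*args)
            assert post(result, *args), "postcondition violated"
            return result
        # Expose the contract so a tool can read it without running the body.
        checked.pre, checked.post = pre, post
        return checked
    return wrap


@contract(pre=lambda xs: len(xs) > 0,
          post=lambda r, xs: r in xs and all(r >= x for x in xs))
def largest(xs):
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best


def generate_tests(fn, make_input, n=100, seed=0):
    """Run randomized cases derived from the contract alone; return how many ran."""
    rng = random.Random(seed)
    ran = 0
    for _ in range(n):
        args = make_input(rng)
        if not fn.pre(*args):
            continue  # input falls outside the contract, so it is not a valid test
        fn(*args)  # raises AssertionError if the postcondition fails
        ran += 1
    return ran


def random_list(rng):
    """Hypothetical input generator: a list of 0 to 5 small integers."""
    return ([rng.randint(-50, 50) for _ in range(rng.randint(0, 5))],)
```

Calling `generate_tests(largest, random_list)` exercises `largest` only on inputs the precondition admits and fails loudly if the postcondition is ever broken, which is roughly the kind of precise behavioral information at the interface I had in mind.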

                                                                                                                            Another benefit might be integration with machine learning to spot refactoring opportunities, especially if they’re simple swaps. For example, there’s a library function that does something and a macro that generates an optimized-for-machine version (e.g., parallelism), and the tool swaps them out based on both the function signature and the info in the specification.
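
A minimal sketch of the swap itself, leaving out the machine learning part entirely: the "specification" is just a string tag, and `spec`, `can_swap`, `pick`, and both `sum_squares` variants are hypothetical names used only for illustration. A real tool would need a much richer notion of specification equality than string comparison.

```python
import inspect


def spec(text):
    """Tag a function with a (toy) behavioral specification."""
    def tag(fn):
        fn.spec = text
        return fn
    return tag


@spec("returns the sum of squares of xs")
def sum_squares(xs):
    total = 0
    for x in xs:
        total += x * x
    return total


@spec("returns the sum of squares of xs")
def sum_squares_fast(xs):
    # Stand-in for a generated, machine-optimized (e.g. parallel) variant.
    return sum(x * x for x in xs)


def can_swap(original, candidate):
    """A swap is allowed only if both the signature and the spec match."""
    same_signature = inspect.signature(original) == inspect.signature(candidate)
    # A missing spec never matches anything, not even another missing spec.
    same_spec = getattr(original, "spec", None) == getattr(candidate, "spec", object())
    return same_signature and same_spec


def pick(original, candidates):
    """Return the first swappable candidate, or the original if none qualifies."""
    for candidate in candidates:
        if can_swap(original, candidate):
            return candidate
    return original
```

Here `pick(sum_squares, [sum_squares_fast])` returns the optimized variant, while a candidate with a different parameter list or a different spec is left alone.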

                                                                                                                      2. 7

                                                                                                                        Want to trade longer runtimes for longer compile times? There’s a tool for that. Need to execute a bit of code in the caller’s context, without forcing boilerplate on the developer? There’s a tool for that. Macros are a tool, not a last resort. I’m sure Grammarly’s code is no more of a monstrosity than you’d see at the equivalent Java shop, if the equivalent Java shop existed.

                                                                                                                        1. 9

                                                                                                                          A Java shop would be using a bunch of annotations, dependency injection, and similar compile-time tricks with codegen. So still macros, just much less convenient to write :)

                                                                                                                          1. 1

                                                                                                                            the equivalent Java shop

                                                                                                                            I guess that would be LanguageTool. How much of a monstrosity it is is left as an exercise for the reader, mostly because it’s free software and anybody can read it.

                                                                                                                          2. 7

                                                                                                                            This reminds me of when Paul Graham was bragging about how Viaweb was like 25% macros and other Lispers were kind of just looking on in horror trying to imagine what a headache it must be to debug.

                                                                                                                            1. 6

                                                                                                                              The source code of the Viaweb editor was probably about 20-25% macros. Macros are harder to write than ordinary Lisp functions, and it’s considered to be bad style to use them when they’re not necessary. So every macro in that code is there because it has to be. What that means is that at least 20-25% of the code in this program is doing things that you can’t easily do in any other language.

                                                                                                                              It’s such a bizarre argument.

                                                                                                                              1. 3

                                                                                                                                I find it persuasive. If a choice is made by someone who knows better, that choice probably has a good justification.

                                                                                                                                1. 11

                                                                                                                                  It’s a terrible argument; it jumps from “it’s considered to be bad style to use [macros] when they’re not necessary” straight to “therefore they must have been necessary” without even considering “therefore the code base exhibited bad style” which is far more likely. Typical pg arrogance and misdirection.

                                                                                                                                  1. 3

                                                                                                                                    I don’t have any insight into whether the macros are necessary; it’s the last statement I take issue with. For example: Haskell has a lot of complicated machinery for working with state and such that doesn’t exist in other languages, but that doesn’t mean those other languages can’t work with state. They just do it differently.

                                                                                                                                    Or to pick a more concrete example, the existence of the loop macro and the fact that it’s implemented as a macro doesn’t mean other languages can’t have powerful iteration capabilities.
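
To make that concrete, here is a small sketch of a few common `loop` idioms next to one way to write them in Python, where they are built from ordinary language features and library functions instead of a macro. The Common Lisp forms in the comments are illustrative; the variable names are made up.

```python
from itertools import takewhile

xs = [1, 2, 3, 4, 5, 6]

# (loop for x in xs when (evenp x) collect (* x x))
even_squares = [x * x for x in xs if x % 2 == 0]

# (loop for i from 1 to 10 sum i)
total = sum(range(1, 11))

# (loop for x in xs for i from 0 collect (list i x))
indexed = [(i, x) for i, x in enumerate(xs)]

# (loop for x in xs while (< x 4) collect x)
prefix = list(takewhile(lambda x: x < 4, xs))
```

None of this says the macro version is worse, only that being implemented as a macro is not what makes the iteration expressible.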

                                                                                                                                    1. 1

                                                                                                                                      One hopes.