1.  

    I think rel="me" is a neat little concept. Had no idea that anyone was misunderstanding it as some sort of author attribution.

    I’m glad the author makes the point that the purpose of markup isn’t to appease Google. Still, I find it strange that so much supposedly “indie” web stuff seems to revolve around silos, and Twitter in particular. I’ve implemented some of it on my own site, but now that Mozilla Persona is dead their auth ideas all seem to rely on “log in via twitter” (or some other proprietary service) :(

    1. 1

      But the firm’s latest rankings, released last week, show Swift dropping to 11th place. Kotlin slipped from 27th place to 28th place. Though notable, slipping one place in one quarter doesn’t mean that Swift and Kotlin are in decline or even that they’ve peaked. “In general, we caution readers not to assign too much weight to small changes in the rankings; the differences between one spot or another, in general, tend to be slight,” RedMonk co-founder Stephen O’Grady wrote in a blog post analyzing the findings.

      They literally quote someone who says that articles like this don’t mean anything…

      1. 1

        I’ve always stuck with cron for scheduling. On my own laptop I use Task Spooler to queue up expensive commands so that they run one at a time in the background (e.g. large downloads and network transfers; as well as commands which may interfere if run concurrently)

        1. 4

          Never heard of it, but it seems like a super interesting approach to an interactive environment. I can't help but be reminded of that talk by Bret Victor about how we have been programming in almost-anachronistic ways, with no innovation in the interfaces.

          1. 8

            There’s nothing obsolete about text. Visual languages don’t work. They’ve been tried hundreds of times, to no avail, because GUIs are fundamentally bad user interfaces for experienced users. Text is a better interface for power users, and programming languages are for power users.

            1. 14

              Why can’t I re-sort the definitions in my source instead of scrolling around then? Why is it hard to see a dependency graph for all my functions? Why do I have to jump between files all the time? Text - an interface for linear presentation of information - is fundamentally a kludge for code, which is anything but linear.

              1. 1

                Why can’t I re-sort the definitions in my source instead of scrolling around then?

                Sort them by what? It wouldn't be difficult to write a script using the compiler module of Python to reorder the declarations in your file in an order you chose, which you could then use to replace the text of a buffer in your text editor. But usually I'd suggest that what you want is to see a list of definitions in a particular order, which you could then use to jump to the definitions.
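                For what it's worth, a rough sketch of that kind of script using Python's standard ast module (the SOURCE snippet is made up for illustration; note that ast.unparse, available since Python 3.9, discards comments, which is one reason a real tool would want a concrete-syntax tree):

```python
import ast

SOURCE = """
def zeta():
    return 1

def alpha():
    return 2

class Mid:
    pass
"""

# Parse the module, pick out the top-level definitions, and re-emit
# them sorted by name. Comments are lost by ast.unparse, so a real
# tool would work on a concrete-syntax tree instead.
tree = ast.parse(SOURCE)
defs = [n for n in tree.body
        if isinstance(n, (ast.FunctionDef, ast.ClassDef))]
defs.sort(key=lambda n: n.name)
print("\n\n".join(ast.unparse(n) for n in defs))
```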

                In every case that I've seen of not using plain text, it inevitably becomes inscrutable. What is actually in my Smalltalk/Lisp image? What is actually there? What can people get out of it later when I deploy it?

                Why is it hard to see a dependency graph for all my functions?

                Because nobody has written something that will take your source files, determine their dependencies, and produce the DOT output (a very popular text-based format for graphs, far superior in my opinion to any binary graph description format) for that graph? It’s not like it’s particularly difficult.
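                To illustrate how little machinery this needs (the toy SOURCE below is made up), Python's ast module plus a few lines gets you DOT output for the call graph among a module's top-level functions:

```python
import ast

SOURCE = """
def parse(s): return tokenize(s)
def tokenize(s): return s.split()
def main(): print(parse("a b"))
"""

# Map each top-level function name to its definition.
tree = ast.parse(SOURCE)
funcs = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}

# An edge (a, b) means "a calls b", restricted to functions we defined.
edges = set()
for name, node in funcs.items():
    for call in ast.walk(node):
        if (isinstance(call, ast.Call)
                and isinstance(call.func, ast.Name)
                and call.func.id in funcs):
            edges.add((name, call.func.id))

lines = ["digraph deps {"]
lines += ['  "{}" -> "{}";'.format(a, b) for a, b in sorted(edges)]
lines.append("}")
print("\n".join(lines))
```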

                Why do I have to jump between files all the time?

                Because it turns out it’s useful to organise things into parts. Because it turns out it’s useful to be able to parallelise compilation and not reparse every bit of code you’ve ever written every time you change any part of it.


                I think that it’s definitely a requirement of any decent programming language to have a way to easily take the source code of that programming language and reify it into a syntax tree, for example. That’s very useful to have in a standard library. In Lisp it’s just read, Python has more complex syntax and requires more machinery which is in a standard library module, other languages have similar things.

                One point might be: maybe you don’t need a dependency graph if you can just make your code simpler, maybe you don’t need to jump around files much if your code is properly modularised (and you have a big enough screen and narrow enough maximum line length to have multiple files open at once), maybe sorting your definitions is wrong and what you want is a sortable list of declarations from which you can jump to the definitions.

                Not to mention that version control is important, and version-controlling things that aren’t text is a problem with conventional version control tools. That might not be an issue if you have your own VCS, but then you enter the land of expecting new users of your language not only to not use their standard editor, but also to not use their standard VCS, their standard pastebin, etc. How do you pastebin a snippet of a visual language so someone on an IRC channel can see it and give you help? How do you ask questions on StackOverflow about a visual language?

                It’s not even an issue of them being unusual and unsupported. By their very nature, not using text means that these languages aren’t compatible with generic tools for working with text. And never will be. That’s the thing about text, rather than having many many many binary formats and few tools, you have one binary format and many many tools.

                1. 8

                  Hey Miles, thanks for elaborating. I think we could have more interesting discussions if you give me a bit more credit and skip the trivial objections. You’re doing the same thing you did last time with C++ compilers. Yes, I know I could write a script, it’s not the point. I’m talking about interactive tools for source code analysis and manipulation, not a one-off sort.

                  I don’t agree with your objections about parallel compilation and parsing. It seems to me that you’re just thinking about existing tools and arguing from the status quo.

                  Further down, you make a suggestion which I interpret as “better languages could mitigate these issues” which is fair, but again I have to disagree because better languages always lead to more complex software which again requires better tools, so that’s a temporary solution at best.

                  You also raise a few objections, and here I should clarify that what I have in mind is not some kind of visual flowchart editor. What I’m claiming is that the conflation of internal representation and visual representation for code is counterproductive, but I think that a display representation that mostly looks like text is fine (as long as it’s actually within a structured editor). What I’m interested in is being able to manipulate symbols and units of code as well as aspects of its structure rather than individual characters.

                  Consequently, for pastebin or StackOverflow, you could just paste some text projection of the code, no problem. When it comes to VCS, well, the current situation is quite poor, so I’d welcome better tools there. For example, if there was a VCS that showed me diffs that take into account the semantics of the language (eg like this: https://www.semanticmerge.com), that would be pretty cool.

                  For the rest of your objections, I offer this analogy: imagine that we only had ASCII pictures, and none of this incompatible JPG/PSD/PNG nonsense with few complicated tools. Then we could use generic tools for working with text to manipulate these files, and we wouldn’t be constrained in any way whether we wanted to create beautiful paintings or complex diagrams. That’s the thing about text!

                  I think the practitioners and particularly academics in our field should have more sense of possibilities and less affection for things the way they are.

                  1. 1

                    When it comes to VCS, well, the current situation is quite poor, so I’d welcome better tools there.

                    Existing VCS could work reasonably well if the serialisation/“text projection” was deterministic and ‘stable’, i.e. minimising the amount of spurious changes like re-ordering of definitions, etc. As a first approximation I can imagine an s-expression language arranging the top-level expressions into lexicographic order, spreading them out so each sub-expression gets its own line, normalising all unquoted whitespace, etc. This would be like a very opinionated gofmt.
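                    A minimal sketch of such a canonicaliser in Python (the grammar is a toy one: atoms and parentheses only, no strings or quoting):

```python
import re

def parse(src):
    """Parse a toy s-expression source into nested Python lists."""
    toks = re.findall(r"\(|\)|[^\s()]+", src)
    def read(i):
        if toks[i] == "(":
            out, i = [], i + 1
            while toks[i] != ")":
                node, i = read(i)
                out.append(node)
            return out, i + 1
        return toks[i], i + 1
    forms, i = [], 0
    while i < len(toks):
        node, i = read(i)
        forms.append(node)
    return forms

def show(node):
    """Re-emit a form with fully normalised whitespace."""
    if isinstance(node, list):
        return "(" + " ".join(show(n) for n in node) + ")"
    return node

# Top-level forms sorted lexicographically by their printed form,
# so the on-disk text is deterministic regardless of edit order.
src = "(define (b) 2)\n(define   (a)\n 1)"
canon = "\n".join(sorted(show(f) for f in parse(src)))
print(canon)
```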

                    If users wan’t to preserve some layout/etc. then the editor can store that as metadata in the file. I agree that semantics-aware diffing would be great though ;)

                    1. 1

                      So you always end up separating the storage format from display representation in order to create better tools, which is exactly my point.

                      1. 1

                        Yes, I agree with your points. Was just remarking that some of these improvements (e.g. VCS) are easier to prototype and experiment with than others (e.g. semantics-aware queries of custom file formats).

                    2. 1

                      The way I see it is that there are tools for turning text into an AST and you can use them to build the fancy things you want. My point wasn’t ‘you can write that sort as a one-off’. You can edit code written in a text-based programming language with a really fancy editor that immediately parses it to an AST and works with it as an AST, and only turns it into text when written to disk. I have no problem with that. But really you’re still editing text when using something like paredit.

                      Something like vim but where the text objects are ‘identifier’, ‘ast node’, ‘expression’, ‘statement’, ‘logical line of code’, ‘block’, etc. rather than ‘text between word separators’, ‘text between spaces’, ‘line’, etc. would be a useful thing. In fact, you could probably do this in vim. I have an extension I use that lets you modify quotes around things taking into account escaped quotes within, etc. That’d probably work way better if it had that default structure for normal text and then could be customised to actually take into account the proper grammar of particular programming languages for which that is supported.

                      What I’m concerned about is the idea that it’s a good idea to store code in a proprietary binary file format that’s different for every language, where you can’t use the same tools with multiple languages. And then having to reimplement the same basic functionality for every single language in separate IDEs for each, where everything works slightly differently.

                      I do find it useful that I can do ci( and vim will delete everything inside the nearest set of parentheses, properly taking into account nesting. So if I have (foo (hello 1 2 3) bar) and my cursor is on the a, it’ll delete everything, even though the nearest ( and ) are beside hello and not foo. That kind of thing, more structured editing? I’m all for that.
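                      That kind of nesting-aware matching is simple to sketch over plain text; here's a rough Python approximation of what an i( text object has to compute (using the example string from above):

```python
def inner_parens(text, cursor):
    """Return (open, close) indices of the innermost balanced pair of
    parentheses enclosing `cursor`, or None if there isn't one."""
    stack, spans = [], []
    for i, ch in enumerate(text):
        if ch == "(":
            stack.append(i)
        elif ch == ")" and stack:
            spans.append((stack.pop(), i))
    best = None
    for a, b in spans:
        # Innermost = the enclosing pair whose '(' is furthest right.
        if a < cursor < b and (best is None or a > best[0]):
            best = (a, b)
    return best

s = "(foo (hello 1 2 3) bar)"
# Cursor inside "bar": the nearest enclosing pair is the outer one,
# even though a ')' sits closer to the left of the cursor.
print(inner_parens(s, s.index("bar")))
```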

                      Consequently, for pastebin or StackOverflow, you could just paste some text projection of the code, no problem. When it comes to VCS, well, the current situation is quite poor, so I’d welcome better tools there. For example, if there was a VCS that showed me diffs that take into account the semantics of the language (eg like this: https://www.semanticmerge.com), that would be pretty cool.

                      Ultimately I think if you have a recognised standardised text projection of your code, you might as well just make that the standardised format for it, then your fancy editor or editor plugin can parse it into the structures it needs. This helps ensure you can edit code over SSH, and have a variety of editors compatible with it, rather than just the single language-designer-provided IDE.

                      One of the nice things about git is that it stores snapshots internally rather than diffs. So if you have a language-specific tool that can produce diffs that are better due to being informed by the grammar of the language (avoiding the problem of adding a function and the diff being ‘added a new closing brace to the previous function then writing a new function except for a closing brace’, for example), then you can do that! Change the diff algorithm.
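                      Git already exposes hooks for this via per-path diff drivers; a sketch (the diff.<driver>.xfuncname and diff.<driver>.command keys are real git configuration, while the `semantic-diff` command at the end is hypothetical, standing in for any grammar-aware tool):

```shell
# In .gitattributes, route Python files through a custom driver:
#     *.py diff=pydiff

# Make hunk headers show the enclosing "def"/"class" line rather
# than whatever text happened to precede the hunk.
git config diff.pydiff.xfuncname '^[ \t]*((class|def)[ \t].*)$'

# Replace the textual diff output entirely with an external,
# grammar-aware program (hypothetical command name).
git config diff.pydiff.command 'semantic-diff'
```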

                      For the rest of your objections, I offer this analogy: imagine that we only had ASCII pictures, and none of this incompatible JPG/PSD/PNG nonsense with few complicated tools. Then we could use generic tools for working with text to manipulate these files, and we wouldn’t be constrained in any way whether we wanted to create beautiful paintings or complex diagrams. That’s the thing about text!

                      Well I mean I do much prefer creating a graph by writing some code to emit DOT than by writing code to emit PNG. I did so just the other day in fact. http://rout.nz/nfa.svg. Thank god for graphviz, eh?

                      Note that there’s also for example farbfeld, and svg, for that matter: text-based formats for images. Just because it’s text underneath doesn’t mean it has to be rendered as ASCII art.

                      1. 1

                        Cool, I’m glad we can agree that better tools would be good to have.

                        As far as the storage format, I don’t actually have a clear preference. What’s clearly needed is a separation of storage format and visual representation. If we had that, arguments about tabs vs spaces, indent size, let/in vs where, line length, private methods first or public methods first, vertical vs horizontal space (and on and on) could be nullified because everybody could arrange things however they like. Why can’t we have even such simple conveniences? And that’s just the low hanging fruit, there are far more interesting operations and ways of looking at source that could be implemented.

                        The other day there was a link to someone’s experiment (https://github.com/forest-lang/forest-compiler) where they use one of the text projections as the storage format. That might work, but it seems to me that the way parsing currently happens, there’s a lot of unnecessary work, as whole files are constantly being reparsed because there is no structure to determine the relevant scope. It seems that controlling operations on the AST and knowing which branches are affected could be a lot more efficient. I’m sure there’s plenty of literature on this - I’ll have to look for it (and maybe I’m wrong about this).

                        What I’m concerned about is the idea that it’s a good idea to store code in a proprietary binary file format that’s different for every language, where you can’t use the same tools with multiple languages. And then having to reimplement the same basic functionality for every single language in separate IDEs for each, where everything works slightly differently.

                        I understand your concern, but this sounds exactly like the current state of affairs (other than really basic stuff like syntax highlighting maybe). There’s a separate language plugin (or plugins) for every combination of editor/IDE and language, and people keep rewriting all that stuff every time a new editor becomes popular, don’t they?

                        One of the nice things about git is that it stores snapshots internally rather than diffs.

                        Sure, we can glean a bit more information from a pair of snapshots, but still not much. It’s still impossible to track a combination of “rename + change definition”, or to treat changes in the order of definitions as a no-op, for example. Whereas if we were tracking changes in a more structured way (node renamed, sub-nodes modified etc.), it seems like we could say a lot more meaningful things about the evolution of the tree.
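                        As a toy illustration of that idea (made-up snippets, Python's ast module): two function definitions whose bodies have the same AST fingerprint but different names can be reported as a rename, which no line-based diff can express:

```python
import ast

OLD = "def total(xs):\n    return sum(xs)"
NEW = "def grand_total(xs):\n    return sum(xs)"

# Fingerprint a function by dumping its body's AST (source positions
# are excluded by ast.dump's defaults), then match bodies across
# the two versions of the file.
def fingerprint(fn):
    return ast.dump(ast.Module(body=fn.body, type_ignores=[]))

old_fns = {fingerprint(n): n.name
           for n in ast.parse(OLD).body
           if isinstance(n, ast.FunctionDef)}

renames = []
for n in ast.parse(NEW).body:
    if isinstance(n, ast.FunctionDef):
        was = old_fns.get(fingerprint(n))
        if was and was != n.name:
            renames.append((was, n.name))
print(renames)
```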

                        Thank god for graphviz, eh?

                        Perhaps the analogy was unclear. Being able to write a set of instructions to generate an image with a piece of software has nothing to do with having identical storage format and visual representation. If we approached images the same way we approach code, we would only have ASCII images as the output format, because that’s what is directly editable with text tools. Since you see the merits of PNG and SVG, you’re agreeing that there’s merit in separating internal/storage representation from the output representation.

                        1. 1

                          What I’m concerned about is the idea that it’s a good idea to store code in a proprietary binary file format that’s different for every language

                          I might have missed something, but I didn’t see anyone proposing this.

                          In particular, my understanding of Luna is that the graphical and textual representations are actually isomorphic (i.e. one can be derived if given the other). This means we can think of the textual representation as being both a traditional text-based programming language and a “file format” for serialising the graphical programming language.

                          Likewise we can switch to a text view, use grep/sed/etc. as much as we like, then switch back to a graphical view if we want (assuming that the resulting text is syntactically valid).

                    3. 1

                      Tools that improve navigation within textual source have existed for a long time. I’ve been using cscope to bounce around in C and Javascript source bases for as long as I can remember. The more static structure a language has, the easier it is to build these tools without ambiguity. The text source part isn’t really an issue – indeed it enables ad hoc tooling experiments to be built with existing text management tools; e.g., grep.

                      1. 4

                        Those tools aren’t text, though. They’re other things the augment the experience over just using text which becomes an incidental form of storage. Tools might also use AST’s, objects, data flows, constraints, and so on. They might use anything from direct representation to templates to synthesis.

                        I think the parent’s point was just text by itself is far more limited than that. Each thing I mentioned is available in some programming environment with an advantage over text-driven development.

                        1. 1

                          I think it’s wrong to say that the text storage is incidental. Line-oriented text files are about the lowest common denominator way we have to store data like this.

                          For starters, it’s effectively human-readable – you can lift the hood up and look at what’s underneath, understanding the effect that each individual character has on the result. Any more complicated structure, as would be generally required to have a more machine-first structured approach to program storage, is not going to have that property; at least not to the same extent.

                          If this thread demonstrates anything, it’s that we all have (at times, starkly!) different preferences for software engineering tools. Falling back on a textual representation allows us to avoid the need to seek consensus on a standard set of tools – I can use the editor and code manipulation tools that make sense to me, and you can stick to what makes sense to you. I think a lot of the UNIX philosophy posturing ends up being revisionist bunk, but the idea that text is a pretty universal interface for data interchange isn’t completely without merit.

                          1. 6

                            The under-the-hood representation is binary-structured electricity that gets turned into human-readable text by parsing and display code. If we’re already parsing it and writing display code, we might just as well use a different encoding or structure. Text certainly has advantages as one encoding among many to have available. Plugins or input modules can take care of any conversions.

                            Text does often have tooling advantages in systems like UNIX built with it in mind, though.

                            1. 1

                              I think it’s a reductionist argument for the good-enough, hard earned status quo. I think it can be valid, but only within a very narrow perspective - operational and short term.

                              To my mind, your position is equivalent to this: we should only have ASCII images, and we don’t need any of that PNG/JPG/PSD stuff with complicated specialised tools. Instead, we can use generic text tools to make CAD drawings, diagrams, paintings - whatever. All of those things can be perfectly represented in ASCII, and the text tools will not limit us in any way!

                          2. 2

                            I want to search my code like a database, e.g. “show me where this identifier is used as a parameter to a function” - the tooling for text doesn’t support this. Structured tooling would be super useful.
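                            As a sketch of how close this already is once code is parsed (toy SOURCE, Python's standard ast module), the query "where is x used as an argument to a function call" is a few lines:

```python
import ast

SOURCE = """
x = 1
print(x)
f(1, x, y=x)
x + 2
"""

# Find every call site where the name `x` appears as a positional or
# keyword argument -- a query plain grep can't express reliably.
# We only parse the source, so undefined names like f are fine.
tree = ast.parse(SOURCE)
hits = []
for node in ast.walk(tree):
    if isinstance(node, ast.Call):
        args = node.args + [kw.value for kw in node.keywords]
        if any(isinstance(a, ast.Name) and a.id == "x" for a in args):
            hits.append(node.lineno)
print(sorted(hits))
```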

                            1. 2

                              Many things can be “queried” with grep and regular expressions, which is also great for finding “similar occurrences” that need to be checked but are related only by some operators and function calls following one another. On the other hand, I’d definitely concede that IDEs have at least a small structured representation of the current source file for navigation, and that you can click some token and find its uses, definitions, implementations… But it only works if I disable low power mode, and with my 8 GB RAM MacBook I sometimes have to kill the IDE before running the program to make sure I can still use the machine at the same time.

                              1. 7

                                Maybe if it wasn’t parsing and re-parsing massive amounts of text all the time, it would be more energy efficient…

                              2. 1

                                Exactly. And it could extend beyond search; code could be manipulated and organised in more powerful ways. We still have rudimentary support for refactoring in most IDEs, and so we keep going through files and manually making structurally similar changes one by one, for no reason other than the inadequate underlying representation used for code.

                                I could be wrong and maybe this is impossible to implement in any kind of general way beyond the few specific examples I’ve thought of, but I find it strange that most people dismiss the very possibility of anything better despite the fact that it’s obviously difficult and inconvenient to work with textual source code.

                                1. 1

                                  The version of cscope that I use does things of that nature. The list of queries it supports:

                                  Find this C symbol:
                                  Find this global definition:
                                  Find functions called by this function:
                                  Find functions calling this function:
                                  Find this text string:
                                  Change this text string:
                                  Find this egrep pattern:
                                  Find this file:
                                  Find files #including this file:
                                  Find assignments to this symbol:
                                  

                                  I use Find functions calling this function a lot, as well as Find assignments to this symbol. You could conceivably add more query types, and I’m certain there are other tools, less suited to my admittedly terminal-heavy aesthetic preferences, that offer more flexible code search and analysis.

                                  The base structure of the software being textual doesn’t get in the way of this at all.

                                  1. 3

                                    Software isn’t textual. We read the text into structures. Our tools should make these structures easier to work with. We need data structures other than text as the common format.

                                    Can I take cscope’s output and filter down to “arguments where the identifiers are of even length”?

                                    1. 5

                                      Compilers and interpreters use structured representations because those representations are more practical for the purposes of compiling and interpreting. It’s not a given that structured data is the most practical form for authoring. It might be. But what the compiler/interpreter does is not evidence of that.

                                      1. 1

                                        Those representations are more practical for searching and manipulating. Try it!

                                      2. 1

                                        I would also be interested in your thoughts about Lisp, where the code is already structured data. This is an interesting property of Lisp, but it does not seem to make it clearly easier to use.

                                        1. 2

                                          but it does not seem to make it clearly easier to use.

                                          Sure it does: it makes macros easier to write than in a language not designed like that. Once macros are easy, you can extend the language to express yourself more easily. This is seen in the DSLs of Common Lisp, Rebol, and Racket. I also always mention sklogic’s tool, since he builds DSLs for everything, with a Lisp underneath for when they don’t work.

                                  2. 2

                                    Sure, but all of these tools (including IDEs) are complicated to implement, error-prone, and extremely underpowered. cscope is just a glorified grep unless I’m missing something (I haven’t used it, just looked it up). The fact that you bring it up as a good example attests to the fact that we’re still stuck somewhere near mid-twentieth century in terms of programming UI.

                                    1. 4

                                      I bring it up as a good example because I use it all the time to great effect while working on large scale software projects. It is relatively simple to understand what it does, it’s been relatively reliable in my experience, and it helps a lot in understanding the code I work on. I’ve also tried exuberant ctags on occasion, and it’s been pretty neat as well.

                                      I don’t feel stuck at all. In fact, I feel wary of people attempting to invalidate positive real world experiences with assertions that merely because something has been around for a long time that it’s not still a useful way to work.

                                2. 2

                                  Have you noticed that the Luna language has a dual representation? Each visual program has an immediate and easily editable text representation, and the same is true in the other direction as well. This is intended to keep the benefits of the text interface while adding the benefits of a visual representation. That’s actually the main idea behind Luna.

                                  1. 1

                                    What about the power users who use things like Excel or Salesforce? These are GUIs perfectly tailored to specific tasks. A DJ working with a sound board certainly wouldn’t want a textual interface.

                                    Textual interfaces are bad, but they are generic and easy to write. It’s a lot harder to make an intuitive GUI, let alone one that works on something as complex as a programming language. Idk if Luna is worthwhile, but text isn’t the best user interface possible imho

                                    1. 3

                                      DJs use physical interfaces, and GUI emulations of those physical interfaces are basically all terrible.

                                      I’ve never heard of anyone liking Salesforce, I think that must be Stockholm Syndrome. Excel’s primary problem in my opinion is that it has essentially no way of seeing how data is flowing around. If something had the kind of ‘reactive’ nature of Excel while being text-based I’d much prefer that.

                                      Textual interfaces are excellent. While there are tasks that benefit from a GUI - image editing for example - in most cases GUIs are a nicer way of representing things to a new user but are bad for power users. I wouldn’t expect first year computer science students to use vim, as it’s not beginner-friendly, but it’s by far the best text editor out there in the hands of an experienced user.

                                      1. 2

                                        I wouldn’t expect first year computer science students to use vim, as it’s not beginner-friendly, but it’s by far the best text editor out there in the hands of an experienced user.

                                        I’d call myself an “experienced user” of vim. I’ve written extensions, given workshops, and even written a language autoindent plugin, which anyone who’s done it knows is like shoving nails through your eyeballs. About once a year I get fed up with the limitations of text-only programming and try to find a good visual IDE, only to switch back when I can’t find any. Just because vim is the best we currently have doesn’t mean it’s actually any good. We deserve better.

                                        (For the record, vim isn’t beginner-unfriendly because it’s text only. It’s beginner-unfriendly because its UI is terrible and inconsistent and the features are all undiscoverable.)

                                        1. 2

                                          Most people don’t bother to learn vimscript properly, treating it much like people treated Javascript for years: a bunch of disparate bits they’ve picked up over time, with no unifying core. But once you actually learn it, it becomes much easier to use and more consistent. The difference between expressions and commands becomes sensible instead of seeming like an inconsistency.

                                          I never get fed up with the limitations of text-only programming, because I don’t think they exist. Could you elaborate on what you are saying those limitations are?

                                          And I totally, 100% disagree with any claim that vim’s UI is bad or inconsistent. On the contrary, it’s extremely consistent. It’s not a bunch of little individual inconsistent commands, it’s motions and text objects and such. It has extensive and well-written help. Compared to any other IDE I’ve used (a lot), it’s way more consistent. Every time I use a Mac program I’m surprised at how ad-hoc the random combinations of letters for shortcuts are. And everything requires modifier keys, which are written with ridiculous indecipherable symbols instead of ‘Ctrl’ ‘Shift’ ‘Alt’ etc. Given that Mac is generally considered to be very easy to use, I don’t think typical general consensus on ease of use is very instructive.

                                  2. 2

                                    Bret Victor explains the persistence of textual languages as resistance to change, drawing an equivalence between users of textual languages now and assembly programmers who scoffed at the first higher-level programming languages. But this thread is evidence that at least some people are interested in using a language that isn’t text-based. Not everyone is fairly characterized by Bret Victor’s generalization. So then why hasn’t that alternative emerged? There are plenty of niche languages that address a minority preference with reasonable rates of adoption. With the exception of HyperCard, I can’t think of a viable graphical programming language. Even Realtalk, the language that runs Dynamicland (Bret Victor’s current focus), is text-based, being a superset of Lua. I keep hearing about how text-based languages are old-fashioned and should die out, but I never hear anything insightful about why this hasn’t happened naturally. I’m not denying that there are opportunities for big innovation, but “make a visual programming language” seems like an increasingly naive or simplistic approach.

                                    1. 3

                                      I think it has to do with the malleability of text. There’s a basic set of symbols and one way to arrange them (sequentially.) Almost any problem can be encoded that way. Emacs’ excellent org-mode is a testament to the virtue of malleability.

Excel also has that characteristic. Many, many kinds of problems can be encoded in rectangles of text with formulas. (Though I might note that having more ways to arrange things allows new kinds of errors, as evidenced by the growing cluster of Excel features for tracing dependencies & finding errors.)

                                      Graphical languages are way less malleable. The language creator decides what elements, relations, and constraints are allowed. None of them let me redefine what a rectangle represents, or what relations are allowed between them. I think that’s why these languages can be great at solving one class of problem, but a different class of problem seems to require a totally different graphical language.

                                      1. 1

                                        My suspicion is that it’s because graphical languages merge functionality and aesthetics, meaning you have to think very, VERY hard about UI/UX and graphic design. You need to be doing that from the start to have a hope of it working out.

                                    1. 1

                                      I think there are actually two distinct forms of “reusability” in the wild, which I like to distinguish between.

Some things are reusable because they’re so small and simple that they can slot into solutions for all sorts of problems. Examples include unix utilities, JSON/protobuf/etc., arrays, linked lists, filesystems, SQLite (RDBMS servers like Postgres are debatable, I personally wouldn’t call them “simple”), key/value stores, etc.

                                      This is a good form of reusability, since this sort of code tends to be solving a well-specified (although abstract) problem, and can often be considered “finished”. Pulling out such functionality from a larger program hence reduces the amount of stuff we need to care about.

                                      Other things are reusable because they have so many hooks/options that we can configure them to plug into all sorts of solutions. Examples include stereotypical “enterprise ready” software, towering monoliths of code like GCC and GHC, “frameworks” and “ecosystems” like Symfony, Drupal, Wordpress, etc.

This second form of “reusability” is often undesirable. We need to go out of our way to support other use-cases, which may not even exist. This sort of software is often modelling some concrete, domain-specific concern like “users” or “reports”, which is informal enough that the decisions encoded in one implementation are probably unsuitable for other applications; hence the need to keep adding more and more parameters and overrides.

                                      I think there’s a correlation with how functional programming and object oriented programming tend to be practiced too. FP libraries usually provide some abstract entity, which may be simple, self-contained and reusable in the first sense. Whilst OOP libraries often provide interfaces to some concrete entity, like a Web service, with hooks and parameters to make it reusable in the second sense.

I think this may be a consequence of their approaches to code layout, the expression problem, etc. OOP limits what we can do (the set of methods), but subclassing makes it easy to override and alter how those things are done. FP limits what there is (the data constructors), but makes it easy to use those things in new ways.
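A toy sketch of that contrast in Python (the class and function names here are my own invention, not from any real library):

```python
# OOP flavour: the set of operations is fixed (render), but subclasses
# can easily override *how* each operation is done.
class Report:
    def render(self):
        return "report"

class HtmlReport(Report):
    def render(self):
        return "<p>" + super().render() + "</p>"

# FP flavour: the data is fixed (a plain tagged tuple), but anyone can
# write new functions over it without touching existing code.
shape = ("circle", 2.0)  # (tag, radius)

def area(s):
    tag, r = s
    return 3.14159 * r * r if tag == "circle" else 0.0

def describe(s):
    # A "new way to use" the same data, added after the fact
    return f"a {s[0]} with area {area(s):.2f}"
```

Adding a new subclass is easy on the left; adding a new function is easy on the right. The pain appears when you want the opposite extension, which is the expression problem in a nutshell.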

                                      1. 6

Extensibility and re-usability are potential goals for system boundaries only. … creating something that’s re-usable is pretty inherently about creating something that’s a system boundary, to some degree or another. And if you’re knowingly working on a system boundary… you’re knowingly working on something that’s supposed to be re-usable already. It’s pretty redundant to hail it as a design goal.

This is great. I’ve been collecting anti-reuse, anti-abstraction links. Every time I add one to my collection I’m going to share the whole thing.

                                        1. 4

                                          The part you quoted is also followed by:

                                          System boundaries are the danger zones of design. To whatever extent possible, we don’t want to create them unnecessarily. Any mistakes we make there are frozen, under threat of expensive breaking changes to fix.

This is a nice counter to the approach of “all classes should be isolated”, “inject everything”, “never use ‘new’”, “mock everything”, etc. that I’ve encountered in old jobs. It turns the implementation detail of internal code organisation boundaries (i.e. classes) into faux system boundaries, which makes them much more rigid and burdensome to change, slowing us down and discouraging refactoring.

                                        1. 1

                                          Sounds like a pleasant experience. One thing I wanted to point out is that many FP practices/principles (single assignment, returning fresh data instead of mutating, etc.) can be followed in almost any language, in case any JS or TypeScript users didn’t want to switch to something like PureScript. I find that they tend to be good ideas even on their own, as long as it’s not going against the grain of surrounding code.
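As a sketch of the “return fresh data instead of mutating” idea in plain Python (the function names here are made up for illustration):

```python
def add_item_mutating(items, x):
    """Mutating style: the caller's list changes under them,
    which can surprise anyone else holding a reference to it."""
    items.append(x)
    return items

def add_item(items, x):
    """FP style: build and return a fresh list; the argument
    is left untouched, so callers can't be surprised."""
    return items + [x]

original = [1, 2]
updated = add_item(original, 3)
# original is still [1, 2]; updated is a new value [1, 2, 3]
```

Nothing about this requires an FP language; it is just a discipline, and (as the comment says) it works as long as the surrounding code isn’t built around shared mutation.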

                                          1. 8

                                            I agree with the overall message, but I find some of the particulars to be non-issues. Personally, I instinctively stick to the following, and it avoids a lot of issues without having to switch to a “big boy” language quite as soon/often:

                                            • Always quote variable substitutions, e.g. "$foo" never $foo. I find it disingenuous when someone (not this author, but it’s common enough) complains about e.g. filenames with spaces, says they don’t want to quote everything because it’s ugly/hassle/etc., then claims to prefer a language like Python/Go/etc. where all strings must be quoted. Why is it ugly/hassle in a shell script but not in those other languages? Don’t think about it as “adding quotes”, just default to treating shells like any other language.
                                            • Never use word splitting. If the shell has arrays then it’s basically an anti-feature, so I just pretend that it doesn’t exist so that I’m never even tempted to use it. Again, just quote everything ;)
                                            • Treat each shell as a different language. The difficulty of making scripts “portable” is a common complaint about shells, which is valid but again rather disingenuous if the “solution” is to enforce a particular dependency like Go (for compiling, at least) or Python (which is usually shorthand for something like “CPython > 2.7 && < 3” or “CPython 3.5+ with these packages…”). If you’re going to force a particular interpreter for Python/etc., then why not enforce a particular interpreter for your shell scripts? If it’s fine to write a Python script, then it’s fine to write a “bash 4+ script”. Who cares if it doesn’t run on zsh or dash? Python scripts don’t run on zsh or dash either.
• These days I tend to go even further and use Nix to bake in the versions of bash, grep, sed, etc. that I’m using, with a pinned nixpkgs version. I do this for Python scripts, etc. as well. If I have a collection of scripts, I’ll also have Nix run Shellcheck on all of them as part of their “build”; that can help catch things like undefined variables too. I don’t tweak the settings: just abort on everything, and add # shellcheck disable=SC123 comments as appropriate.
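A minimal bash sketch of the first two points (the filename is made up):

```shell
#!/usr/bin/env bash
set -eu

# A filename containing a space: unquoted, $f would split into two words
f="summer photos.txt"
touch "$f"          # quoted: a single argument, as intended

# Build argument lists with an array instead of relying on word splitting
args=(-- "$f")
ls "${args[@]}"     # each array element stays one word, spaces and all
```

Quoting every substitution and using arrays for argument lists covers the vast majority of the “spaces in filenames” class of bugs, with no thought required.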
                                            1. 4

                                              Don’t use N computers when 1 will do.

This is so important, because people seem to forget that computers are fast. Really fast. Absolutely mind-bogglingly fast. CPUs operate on a level of granularity of billionths of a second, while humans can barely sense thousandths of a second.

The only reason computers are “slow” is because most software is crap. I mostly blame Windows and the Web, since they’ve misled average computer users into believing they have to wait for bloat to run (rather than regularly shouting at developers to profile their code).

An average laptop running F-Stack and a decent web server could host a website handling millions of requests per second. Distributed systems on AWS are nice and all, but they’re way harder to build than systems that run on a single server. And they’re far harder to profile and optimize.

                                              /rant

                                              Out of curiosity, I wonder how the author managed to handle concurrency in SQLite, since it locks the whole DB on writes. He mentions writing multiple SQLite databases, but I’d be interested in more details.

                                              1. 1

                                                Hmm, not come across F-Stack before. Ironic, in the context of your rant, that opening their page showed me nothing but a loading animation for a few seconds :P

                                                1. 1

                                                  Out of curiosity, I wonder how the author managed to handle concurrency in SQLite, since it locks the whole DB on writes. He mentions writing multiple SQLite databases, but I’d be interested in more details.

                                                  He also mentions using a nifty new (first appeared in 2010 :)) SQLite feature called WAL (Write-Ahead Logging). It makes writers and readers not block each other, plus it makes writes so fast (essentially just an append to a log) that they very rarely block each other.

                                                  I’m not sure why it’s not on by default, but that’s probably just standard SQLite conservatism in practice.

                                                  https://www.sqlite.org/wal.html
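For the curious, a minimal sketch of turning WAL on from Python’s built-in sqlite3 module (the table name and path are made up). Since WAL is a property of the database *file*, it stays enabled for later connections too:

```python
import os
import sqlite3
import tempfile

# WAL needs an on-disk database (in-memory DBs report "memory")
path = os.path.join(tempfile.mkdtemp(), "example.db")
conn = sqlite3.connect(path)

# Switch the journal mode; the pragma returns the mode now in effect
conn.execute("PRAGMA journal_mode=WAL")
mode = conn.execute("PRAGMA journal_mode").fetchone()[0]  # "wal"

conn.execute("CREATE TABLE events (msg TEXT)")
conn.execute("INSERT INTO events VALUES ('hello')")
conn.commit()
conn.close()
```

After this, readers see a consistent snapshot while a writer appends to the log, which is where the “writers and readers don’t block each other” property comes from.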

                                                1. 2

                                                  Linters are great for running as git pre-commit hooks. I like the “out of the box” ideas given in this article: i.e. it’s not just about code style.

                                                  It’s important to keep these fast though. I treat slow linters like integration tests: have them run on a build server, and only run them locally when reproducing/debugging a failure.

                                                  1. 3

                                                    My own collection of PDFs is getting a little ridiculous:

                                                    $ find Documents/ -iname '*.pdf' | wc -l
                                                    2385
                                                    

                                                    Most of these are stored with the original filename, which can be nice in some ways (e.g. I’ve often tried to save a paper off ArXiv only to be told the file already exists!), but as you say they’re often useless from a human/semantic perspective.

                                                    I’ll certainly take a look at PaperBoy to help with this.

                                                    For anyone interested, I wrote up some tips for managing PDFs, including links to a bunch of scripts (and a few snippets of my own): http://chriswarbo.net/projects/pdf-tools.html

                                                    It was written a few years ago, so unfortunately some of the links are broken (e.g. the Nix packages have probably moved to http://chriswarbo.net/git/warbo-packages.git now)

                                                    1. 1

                                                      I make heavy use of assertions in many languages. A really easy way to get more context is to just dump out a map/dictionary of values. For example, in Python:

                                                      assert foo == bar, repr({
                                                        "error": "Foo and bar should not be equal",
                                                        "foo": foo,
                                                        "bar": bar,
                                                      })
                                                      

                                                      It’s easy enough to wrap this into a function (although that may lose line number information, depending on the language), e.g.:

def check(cond, info):
  assert cond, repr(info)
  return cond
                                                      

                                                      I do a similar thing in other languages by dumping out the context information as JSON. One thing to be careful about is whether that info itself will cause problems: huge values are annoying, but infinite/cyclic values may break the error message itself! Also in a lazy language we may have values in scope which haven’t yet been forced, and trying to write them out as JSON may trigger an error. I find it best to code defensively in these situations (assertions are useful precisely when we are wrong about what’s happening!), e.g. not including values which are only accessed after the assertion.
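As a sketch of that defensive approach in Python (the helper name dump_context is my own invention), json.dumps can be given a fallback for unserialisable values, with a catch-all for cyclic structures so the error message itself never fails:

```python
import json

def dump_context(info):
    """Serialise assertion context defensively: fall back to repr()
    for values JSON can't handle, and catch cyclic structures
    (which raise ValueError) outright."""
    try:
        return json.dumps(info, default=repr)
    except (ValueError, RecursionError):
        return repr(info)

class Opaque:
    """A stand-in for some non-JSON-serialisable value."""
    pass

# The Opaque instance gets repr()'d instead of crashing the message
msg = dump_context({"foo": 1, "weird": Opaque()})
```

The same idea works in other languages: wrap the context-serialisation in its own error handling, since assertions fire precisely when your assumptions (including assumptions about the context values) are wrong.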

                                                      1. 5

                                                        For those trying to learn Nix, I highly recommend learning the Nix language using the REPL. If you’re on a 1.x version of Nix (or NixOS < 18.03) you can use the nix-repl command provided by the nix-repl package, e.g.

                                                        $ nix-shell -p nix-repl --run nix-repl
                                                        

                                                        If you’re on Nix 2.x or above (or NixOS 18.03 or later) this is now built-in to Nix, just run:

                                                        $ nix repl
                                                        

                                                        For Nix 2.x there’s also the --show-trace option which will give you more debug info if there’s a failure.

                                                        One gotcha with the REPL is that it will sit waiting for input if it’s given an incomplete expression, even if that expression is from an external file! For example:

                                                        $ echo "''unterminated string" > foo.nix
                                                        $ nix repl --show-trace
                                                        Welcome to Nix version 2.0.4. Type :? for help.
                                                        
                                                        nix-repl> import ./foo.nix
                                                        

                                                        This will hang, presumably waiting for the ''. This is usually obvious for things like strings, but less obvious for e.g. expressions which contain ; like with foo; bar or assert foo; bar (those semicolons aren’t terminators, they’re separators, hence the following expression (bar) is required, or else it’ll hang like above)

                                                        1. 13

                                                          Oh boy. Here, have a hopefully helpful anecdote…

                                                          I spent a lot of time, sweat, tears, and pain deciding to “properly” learn Nix to set up armokweb, aka lobste.rs plays Dwarf Fortress. I’ve yet to do a full writeup/postmortem (although I intend to), but I finished the first iteration of that project with a mostly-positive view of NixOS from no experience whatsoever.

First, let me get it out of the way: the documentation often sucks, and to create custom packages at all you have to read existing packages that do something similar to what you want. There’s not really a “Nix package cookbook” like there should be. Nix Pills gets you part of the way there, but my perception is that the whole nixpkgs contribution process operates on screwing up and learning from maintainers who have more experience than you.

                                                          nixpkgs is beholden to how well the community maintains it, but this is similar to Arch’s situation with the AUR. If there were fewer maintainers, the AUR would not be as useful. Some Nix packages (including the dwarf-fortress one, which I ended up contributing to) had breaking bugs. I’m not sure I’m qualified to argue for how “mature” nixpkgs already is, but it’s definitely getting better.

On a more positive note, I like Nix’s immutable packages and declarative approach to configuration. I tested armokweb in a Nix VM, and it was near-trivial to refactor it so it ran in a container that could be included in a Nix config with two lines of code. Nix’s strong point, IMO, is that it encourages you to produce artifacts (Nix code) that can be used to reproduce a system configuration on another machine, or in a container, or in some other context within the system. Coming from Debian, CentOS, and Arch, it’s a godsend that this is built into the OS, so you don’t have to separate configuring a system from reproducing that configuration for others, the way you would with something like Ansible. It’s also great that you can mix rolling and non-rolling release models all you want: your entire system doesn’t have to be “unstable”; if you install an “unstable” package, only that package and the dependencies that need to track it come from the unstable channel. There are no dependency diamonds that cause apt to complain that the wrong version of libc is installed because you accidentally added the wrong mirror.

Nix the language is sort of like a weird Python/OCaml hybrid, but the lazy evaluation approach works really well for package management. Derivations stringify to their Nix store path, which is the content address of that package; when a derivation is depended on, it’s either built locally (identified by cryptographic content hash) or substituted from a cache. That’s why the NixOS cache can be effectively just that: a cache, which makes installing packages faster because they’re on a nearby server instead of needing to be rebuilt locally.

                                                          Overall, I’m a fan of NixOS. The entirety of system configuration is filling in the blanks in a playbook that you can share with others. Packages have strong guarantees about their integrity. Nix the language is a little weird but workable once you’ve seen enough of it. Containers just use lxc and lots of hardlinks to function. I’d like to see deeper integration with ZFS for managing the Nix store. Currently, garbage collection is done at a user-facing FS level, when Sun Microsystems solved this problem years ago at the block level.

                                                          I think there’s plenty to like about what NixOS provides, with the caveat that they can do all this better down the road. I feel like it would be really informative for them to refactor the Nix store so it’s able to take advantage of the features of ZFS. OpenIndiana already uses ZFS to manage system “generations” in a similar way to how NixOS uses symlinks and hardlinks. They would be able to make their OS better if they learned more from IllumOS, but I have no doubt that they could tackle that if they wanted to.

                                                          1. 2

I’ve heard lots of good things about ZFS, but it smells a bit too monolithic in my (inexperienced) opinion. I can certainly imagine the benefits of setting it up on a server, for example; but I don’t like the sound of a general-purpose, publicly available, userspace application like Nix relying on its functionality (i.e. forcing users to adopt ZFS just to run that particular program).

Maybe NixOS, as a full-blown distro, could encourage use of ZFS in its installer; I think there’s still debate raging about the legalities of bundling things like that. Also, I’m perhaps a little out of touch with the NixOS installation process: I installed 16.03 back when it came out and have been upgrading rather than re-installing since then. My experience was also a little unorthodox: I used the CD ISO image, but since I don’t have an optical drive I ran the installer from within qemu (potentially dangerous, since it was installing to the same /dev/sda drive as the host, and very slow since my CPU doesn’t have virtualisation extensions :P )

                                                            1. 2

                                                              it smells a bit too monolithic in my (inexperienced) opinion

                                                              It definitely is, but I also wouldn’t have as many reservations about putting a NixOS root on ZFS if it could be implemented that way. At present, it’s a file level GC on top of a block level GC. It doesn’t even have to be a hard dependency, especially if it’s implemented with ZFS channel programs, which are a new feature in ZoL that lets root run Lua scripts that implement custom ZFS behavior in the kernel. Read-only mounts of snapshots are just exactly how the Nix store’s implementation currently behaves, and Nix might as well take advantage of features of the filesystem if it has them available.

                                                          1. 2

                                                            Looks very nice. I’m currently using Hydra but it’s annoyingly heavyweight (postgres DB server, user account/authentication system, configured via destructive Web UI, no CLI, etc.). I think I’ll be switching to Laminar at some point ;)

                                                            1. 1

                                                              In case it’s useful, I’ve actually packaged this up for NixOS now (in my own repo, not upstream) http://chriswarbo.net/git/nix-config/git/branches/master/nixos/modules/laminar.nix.raw.html

                                                            1. 4

                                                              I really like ATS as a proof-of-concept, and as a way to avoid falsehoods like ‘low level == unsafe’, ‘safe != fast’, ‘C/C++ is the only way to do X’, etc. It’s a bit too verbose and fiddly for me (I’ve been spoiled by high level, garbage collected languages like Agda, Idris, etc.) but it’s nice to see people (other than its author) playing with it, since that’s the only way to find and smooth-over the rough edges (like with the atspkg tool described in this article!)

                                                              1. 5

                                                                I really want to love NixOS: the ideas, the tools, how things are supposed to work… All they propose sound like future to me. Be able to have my config, which defines how I want my computer to behave, and just plug it in all the machines I may need to use sounds mindblowing.

And personally, I am finding the learning curve to be steep as hell. Not only because the documentation seems to assume that the reader is already slightly familiar with the environment and how things work, but also because I need to modify certain habits to make them work with NixOS. For example, one of the must-haves for me is my Emacs configured as I like. I can tell Nix to clone my Emacs configuration to the home folder, and it should already be able to start downloading the packages it needs; but in reality that is not trivial, because Nix seems to expect the packages to be declared in the Nix configuration instead of the Emacs one (to keep the system deterministic, which makes absolute sense). I am used to having everything available from everywhere, but NixOS has most things isolated by default to keep the purity.

I will keep on fighting with stuff until I figure things out, but I am sure that as the project grows all these corners will be polished to make it more accessible to newcomers.

                                                                1. 5

                                                                  For what it’s worth, I’ve been a heavy user of Nix, NixOS and Emacs for years, but still haven’t bothered configuring Emacs with Nix. The Emacs package I use is emacs25.override { withGTK2 = false; withGTK3 = false; } (this causes it to compile with the lucid toolkit, avoiding http://bugzilla.gnome.org/show_bug.cgi?id=85715 ). I do everything else with a ~/.emacs.d that’s been growing for years, across various distros, and is a mixture of Emacs Prelude (which I started with), ELPA/MELPA/Marmalade and (more recently) use-package. I just install any dependencies into my user profile or NixOS systemPackages. Actually, I define a package called all which depends on everything I want; that way I can keep track of it in git, rather than using commands like nix-env which can cause junk to accumulate. It looks like this:

                                                                  with import <nixpkgs> {};
                                                                  buildEnv {
                                                                    name = "all";
                                                                    paths = [
                                                                      abiword
                                                                      arandr
                                                                      audacious
                                                                      cmus
                                                                      (emacs25.override { withGTK2 = false; withGTK3 = false; })
                                                                      gensgs
                                                                      mplayer
                                                                      picard
                                                                      vlc
                                                                      w3m
                                                                      # and so on
                                                                    ];
                                                                  }
                                                                  

                                                                  There are certainly some aspects of Nix which require “buy in” (it looks like Guix is slightly better in this regard), but there are others which allow “business as usual”.

                                                                  For example, if you want to make a Nix package that just runs some bash commands, you can try runCommand, e.g.

                                                                  with import <nixpkgs> {};
                                                                  runCommand "my-package-name" {} ''
                                                                    # put your bash commands here
                                                                    # the "result" of your package should be written to "$out"
                                                                    # for example
                                                                    mkdir -p "$out/bin"
                                                                    printf "#!/usr/bin/env bash\necho hello world\n" > "$out/bin/myFirstProgram"
                                                                  ''
                                                                  

                                                                  Whether this will work obviously depends on what the commands do, but if it works then it works (you can even run stuff like wget, git clone, etc. if you want to; although I’d include a comment like TODO: use fetchurl or fetchgit). If your scripts need env vars to be set, put them between the {}. If you want some particular program available, put buildInputs = [ your programs here ]; between the {}.

                                                                  Another example is programs which assume the normal FHS filesystem layout: making them work is sometimes as easy as using steam-run (e.g. https://www.reddit.com/r/NixOS/comments/8h1eu5/how_do_you_deal_with_software_that_is_not_well/ ).

                                                                  Whilst there’s complicated infrastructure in Nixpkgs to support packages which use Python, Haskell, autotools, etc. sometimes we can get away without having to go ‘all the way’ :)

                                                                  1. 2

                                                                    Woah, thank you, that was super useful! I think I got it, but I still have to test it and have my own gotcha moments :)

                                                                  2. 4

                                                                    When starting out I just built a few packages from source in the traditional way to make them work the way I was used to, perhaps that could work with emacs and install into home initially. (I don’t use emacs, sorry I can’t help more.)

                                                                    1. 2

                                                                      You’re not alone - I installed NixOS recently and like what I’ve seen, but haven’t been able to put in enough time to get over the learning curve yet. Until I do, I’m fairly sure I’m missing several chances to “do things properly” because I’m not sure what that looks like under NixOS. This post and comments have been quite reassuring at least!

                                                                      I guess that’s the beauty of open source - now we all have to go and fix the documentation?

                                                                      1. 2

                                                                        I guess that’s the beauty of open source - now we all have to go and fix the documentation?

                                                                        Well… I guess. I’ll make some coffee.

                                                                    1. 5

                                                                      Git via email sounds like hell to me. I’ve tried to find some articles that evangelize the practice of doing software development tasks through email, but to no avail. What is the allure of approaches like this? What does it add to just using git by itself?

                                                                      1. 6

                                                                        I tried to collect the pros and cons in this article: https://begriffs.com/posts/2018-06-05-mailing-list-vs-github.html

                                                                        1. 3

                                                                          I also spoke about this at length in a previous article:

                                                                          https://drewdevault.com/2018/07/02/Email-driven-git.html

                                                                          1. 3

                                                                                While my general experience with git email is bad (it’s annoying to set up, especially in older versions, and I don’t like its interface much), my experience of interacting with projects that do this was generally good. You send a patch, you get review, you send a new, self-contained patch attached to the same thread… and so on, in parallel to the rest of the project discussion. It’s a different flavour, but with a project that is used to the flow, it can really be quite pleasing.

                                                                            1. 2

                                                                              What does it add to just using git by itself?

                                                                                  I think the selling point is precisely that it doesn’t add anything else. Creating a PR involves more steps and context changes than git format-patch followed by git send-email.

                                                                              I have little experience using the mailing list flow, but when I had to do so (because the project required it) I found it very easy to use and better for code reviews.

                                                                              1. 1

                                                                                    Creating a PR involves more steps and context changes than git format-patch followed by git send-email.

                                                                                I’m not sure I understand. What steps are removed that would otherwise be required?

                                                                                1. 4

                                                                                  Simply, it’s “create a fork and push your changes to it”. But also consider that it’s…

                                                                                  1. Open a web browser
                                                                                  2. Register for a GitHub account
                                                                                  3. Confirm your email address
                                                                                  4. Fork the repository
                                                                                  5. Push your changes to the fork
                                                                                  6. Open a pull request

                                                                                  In this workflow, you switched between your terminal, browser, mail client, browser, terminal, and browser before the pull request was sent.

                                                                                  With git send-email, it’s literally just git send-email HEAD^ to send the last commit, then you’re prompted for an email address, which you can obtain from git blame or the mailing list mentioned in the README. You can skip the second step next time by doing git config sendemail.to someone@example.org. Bonus: no proprietary software involved in the send-email workflow.
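                                                                                      For the curious, the whole flow described above can be sketched in a few commands. The addresses and SMTP server below are placeholders, and the repo is a throwaway created just for the demo:

                                                                                      ```shell
                                                                                      set -e
                                                                                      # Throwaway repo standing in for a real project clone
                                                                                      tmp=$(mktemp -d) && cd "$tmp"
                                                                                      git init -q demo && cd demo
                                                                                      git config user.email you@example.org
                                                                                      git config user.name "You"
                                                                                      echo hello > README && git add README && git commit -qm "Add README"

                                                                                      # One-time SMTP setup (normally --global, with your real server):
                                                                                      git config sendemail.smtpServer smtp.example.org

                                                                                      # Per-repo: remember the list address so future sends need no lookup.
                                                                                      git config sendemail.to list@example.org

                                                                                      # Show the patch email that `git send-email HEAD^` would send:
                                                                                      git format-patch -1 --stdout | head -n 5
                                                                                      ```

                                                                                      In a real repo the last step would be git send-email HEAD^ itself; format-patch is used here only so the sketch runs without an SMTP server.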

                                                                                  1. 3

                                                                                        Also GitHub pull requests involve more git machinery than is necessary. Most people, when they open a PR, choose to make a feature branch in their fork from which to send the PR, rather than sending from master. The PR exposes the sender’s local branching choices unnecessarily. Then, for each PR, GitHub creates more refs on the remote, so you end up with lots of stuff lying around (try running git ls-remote | grep pull).
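                                                                                        You can reproduce those leftover refs locally without touching GitHub at all; here the “forge” is just a bare repo and the PR ref is pushed by hand (all names are synthetic):

                                                                                        ```shell
                                                                                        set -e
                                                                                        # Local simulation of the per-PR refs a forge keeps (refs/pull/N/head)
                                                                                        tmp=$(mktemp -d) && cd "$tmp"
                                                                                        git init -q --bare upstream.git
                                                                                        git clone -q upstream.git work && cd work
                                                                                        git config user.email you@example.org
                                                                                        git config user.name "You"
                                                                                        echo x > f && git add f && git commit -qm init
                                                                                        git push -q origin HEAD:refs/heads/master

                                                                                        # Simulate what the forge does when a PR is opened:
                                                                                        git push -q origin HEAD:refs/pull/1/head

                                                                                        # The extra refs now show up for every clone:
                                                                                        git ls-remote origin | grep pull
                                                                                        ```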

                                                                                    Compare that with the idea that if you want to send a code change, just mail the project a description (diff) of the change. We all must be slightly brainwashed when that doesn’t seem like the most obvious thing to do.

                                                                                    In fact the sender wouldn’t even have to use git at all, they could download a recent code tarball (no need to clone the whole project history), make changes and run the diff command… Might not be a great way to do things for ongoing contributions, but works for a quick fix.
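                                                                                        As a toy illustration of that no-git flow (file names and contents invented), a plain unified diff is all the project needs to receive:

                                                                                        ```shell
                                                                                        set -e
                                                                                        # Pristine unpacked "tarball" next to an edited working copy
                                                                                        tmp=$(mktemp -d) && cd "$tmp"
                                                                                        mkdir project.orig project
                                                                                        echo 'Hello, wrold' > project.orig/greeting.txt

                                                                                        # Fix a typo in the working copy...
                                                                                        sed 's/wrold/world/' project.orig/greeting.txt > project/greeting.txt

                                                                                        # ...and produce a patch that `patch -p1` or `git apply` understands
                                                                                        # (diff exits non-zero when files differ, hence the `|| true`):
                                                                                        diff -ru project.orig project > fix-typo.patch || true
                                                                                        cat fix-typo.patch
                                                                                        ```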

                                                                                        Of course, opening the PR is just the start of the stymied GitHub interactions to come.

                                                                                    1. 3

                                                                                      In my case I tend to also perform steps:

                                                                                      • 3.1 Clone project
                                                                                      • 3.2 Use project for a while
                                                                                      • 3.3 Make some local changes
                                                                                      • 3.4 Commit those changes to local clone
                                                                                      • 3.5 Try to open pull request
                                                                                      • 3.6 Realise GitHub requires me to make a fork of the original repo
                                                                                      • 4.1 Read man git-remote to see how to point my local clone (with the changes) to my GitHub fork
                                                                                      • 4.2 Run relevant git remote commands
                                                                                      • 4.3 Read man git-push to see how to send my changes to the fork rather than the original repo
                                                                                      1. 2

                                                                                        To send email, you also have to have an email address. If we are doing a fair comparison, that should be noted as well. Granted, it is much more likely that someone has an email address than a GitHub account, but the wonderful thing about both is that you only have to set them up once. So for this reason, it would be a bit more fair if the list above started from step four.

                                                                                            Now, if I have GitHub integration in my IDE (which is not an unreasonable thing to assume), then I do not need to leave the IDE at all, and I can fork, push, and open a PR (case in point, Emacs and Magithub can do this). I can also do all of this on GitHub, never leaving my browser. I don’t have to figure out where to send an email, because it automatically sends the PR to the repo I forked from. I don’t even need to open a shell and deal with the command line. I can do everything with shortcuts and a little bit of mousing around, in both the IDE and the browser case.

                                                                                            Even as someone who is familiar with the command line, and is sufficiently savvy with e-mail (at one point I was subscribed to debian-bugs-dist AND LKML, among other things, and had no problem filtering out the few bits I needed), I’d rather work without having to send patches, using Magit + magithub instead. It’s better integrated and hides uninteresting details from me, so I can get my work done faster. It works out of the box. git send-email does not; it requires a whole lot of setup per repo.

                                                                                        Furthermore, with e-mail, you have to handle replies, have a firm grip on your inbox. That’s an art on its own. No such issue with GitHub.

                                                                                        With this in mind, the remaining benefit of git send-email is that it does not involve a proprietary platform. For a whole lot of people, that’s not an interesting property.

                                                                                        1. 2

                                                                                          To send email, you also have to have an email address. If we are doing a fair comparison, that should be noted as well.

                                                                                          I did note this:

                                                                                          then you’re prompted for an email address, which you can obtain from git blame or the mailing list mentioned in the README

                                                                                          Magit + magithub […] works out of the box

                                                                                          Only if you have a GitHub account and authorize it. Which is a similar amount of setup, if not more, compared to setting up git send-email with your SMTP info.

                                                                                          git send-email does not, it requires a whole lot of set up per repo

                                                                                          You only have to put your SMTP creds in once. Then all you have to do per-repo is decide where to send the email to. How is this more work than making a GitHub fork? All of this works without installing extra software to boot.

                                                                                          1. 3

                                                                                            then you’re prompted for an email address, which you can obtain from git blame or the mailing list mentioned in the README

                                                                                            With GitHub, I do not need to obtain any email address, or dig it out of a README. It sets things up automatically for me so I can just open a PR, and have everything filled out.

                                                                                            Only if you have a GitHub account and authorize it. Which is a similar amount of setup, if not more, compared to setting up git send-email with your SMTP info.

                                                                                                Let’s compare:

                                                                                            e-mail:

                                                                                            1. Clone repository
                                                                                            2. Do my business
                                                                                            3. Figure out where to send e-mail to.
                                                                                            4. git config so I won’t have to figure it out ever again.
                                                                                            5. git send-email

                                                                                            magithub:

                                                                                            1. clone repo
                                                                                            2. do my business
                                                                                            3. fork the repo
                                                                                            4. push changes
                                                                                            5. open PR

                                                                                            The first two steps are pretty much the same, both are easily assisted by my IDE. The difference starts from step 3, because my IDE can’t figure out for me where to send the email. That’s a manual step. I can create a helper that makes it easier for me to do step 4 once I have the address, but that’s about it. For the magithub case, step 3 is SPC g h f; step 4 SPC g s p u RET; step 5 SPC g h p, then edit the cover letter, and , c (or C-c) to finish it up and send it. You can use whatever shortcuts you set up, these are mine. Nothing to figure out manually, all automated. All I have to do is invoke a shortcut, edit the cover letter (the PR’s body), and I’m done.

                                                                                            I can even automate the clone + fork part, and combine push changes + open PR, so it becomes:

                                                                                            1. fork & clone repo (or clone if already forked)
                                                                                            2. do my business
                                                                                            3. push changes & open PR

                                                                                            Can’t do such automation with e-mailed patches.

                                                                                                I’m not counting GitHub account authorization, because that’s about the same complexity as configuring auth for my SMTP, and both have to be done only once. I’m also not counting registering a GitHub account, because that only needs to be done once, and you can use it forever, for any GitHub-hosted repo, and takes about a minute, a minuscule amount of time compared to doing actual development.

                                                                                            Again, the main difference is that for the e-mail workflow, I have to figure out the e-mail address, a process that’s longer than forking the repo and pushing my changes, and a process that can’t be automated to the point of requiring a single shortcut.

                                                                                            Then all you have to do per-repo is decide where to send the email to. How is this more work than making a GitHub fork?

                                                                                            Creating a GitHub fork is literally one shortcut, or one click in the browser. If you can’t see how that is considerably easier than digging out email addresses from free-form text, then I have nothing more to say.

                                                                                            And we haven’t talked about receiving comments on the email yet, or accepting patches. Oh boy.

                                                                                            1. 2

                                                                                              With GitHub, I do not need to obtain any email address, or dig it out of a README. It sets things up automatically for me so I can just open a PR, and have everything filled out.

                                                                                              You already had to read the README to figure out how to compile it, and check if there was a style guide, and review guidelines for contribution…

                                                                                                  Let’s compare

                                                                                              Note that your magithub process is the same number of steps but none of them have “so I won’t have to figure it out ever again”, which on the email process actually eliminates two of your steps.

                                                                                              Your magithub workflow looks much more complicated, and you could use keybindings to plug into send-email as well.

                                                                                              Can’t do such automation with e-mailed patches

                                                                                              You can do this and even more!

                                                                                              1. 2

                                                                                                You already had to read the README to figure out how to compile it, and check if there was a style guide, and review guidelines for contribution…

                                                                                                I might have read the README, or skimmed it. But not to figure out how to compile - most languages have a reasonably standardised way of doing things. If a particular project does not follow that, I will most likely just stop caring unless I really, really need to compile it for one reason or another. For style, I hope they have tooling to enforce it, or at least check it, so I don’t have to read long documents and keep it in my head. I have more important things to store there than things that should be automated.

                                                                                                I would likely read the contributing guidelines, but I won’t memorize it, and I certainly won’t try to remember an e-mail address. I might remember where to find it, but it will still be a manual process. Not a terribly long process, but noticeably longer than not having to do it at all.

                                                                                                Note that your magithub process is the same number of steps but none of them have “so I won’t have to figure it out ever again”, which on the email process actually eliminates two of your steps.

                                                                                                Because there’s nothing for me to figure out at all, ever (apart from what repo to clone & fork, but that’s a common step between the two workflows).

                                                                                                Your magithub workflow looks much more complicated

                                                                                                How is it more complicated? Clone, work, fork, push, open PR (or clone+fork, work, push+PR), of which all but “work” is heavily assisted. None of it requires me to look anything up, anywhere.

                                                                                                and you could use keybindings to plug into send-email as well.

                                                                                                And I do, when I’m dealing with projects that use an e-mail workflow. It’s not about shortcuts, but what can be automated, what the IDE can do instead of requiring me to do it.

                                                                                                You can do this and even more!

                                                                                                You can, if you can extract the address to send patches to automatically. You can build something that does that, but then the automation is tied to that platform, just like the PRs are tied to GitHub/GitLab/whatever.

                                                                                                And again, this is just about sending a patch/opening a PR. There’s so much more PRs provide than that. Some of that, you can do with e-mail. Most of it, you can build on top of e-mail. But once you build something on top of e-mail, you no longer have an e-mail workflow, you have a different platform with which you can interact via e-mail. Think issues, labels for them, reviews (with approvals, rejection, etc - all of which must be discoverable by programs reliably), new commits, rebases and whatnot… yeah, you can build all of this on top of e-mail, and provide a web UI or an API or tools or whatever to present the current state (or any prior state). But then you built a platform which requires special tooling to use to its full potential, and you’re not much better than GitHub. You might build free software, but then there’s GitLab, Gitea, Gogs and a whole lot of others which do many of these things already, and are almost as easy to use as GitHub.

                                                                                                I’ve worked with patches sent via e-mail quite a bit in the past. One can make it work, but it requires a lot of careful thought and setup to make it convenient. I’ll give a few examples!

                                                                                                With GitHub and the like, it is reasonably easy to have an overview of open pull requests, without subscribing to a mailing list, or browsing archives. An open PR list is much easier to glance at and have a rough idea than a mailing list. PRs can have labels to help in figuring out what part of the repo they touch, or what state they are in. They can have CI states attached. At a glance, you get a whole lot of information. With a mailing list, you don’t have that. You can build something on top of e-mail that gives you a similar overview, but then you are not using e-mail only, and will need special tooling to process the information further (eg, to limit open PRs to those that need a review, for example).

                                                                                                    With GitHub and the like, you can subscribe to issues and pull requests, and you’ll get notifications about those and those alone. With a mailing list, you rarely have that option, and must do filtering on your own, and hope that there’s a reasonable convention that allows you to do so reliably.
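                                                                                                    For what it’s worth, when a list does follow a convention such as a stable List-Id header, that filtering is a one-liner. A made-up two-message mbox as a sketch (all addresses and subjects invented):

                                                                                                    ```shell
                                                                                                    set -e
                                                                                                    # Build a tiny mbox with one -dev message and one -users message
                                                                                                    tmp=$(mktemp -d) && cd "$tmp"
                                                                                                    printf '%s\n' \
                                                                                                      'From dev Mon Jan  1 00:00:00 2018' \
                                                                                                      'List-Id: project development <dev.project.example.org>' \
                                                                                                      'Subject: [PATCH v2] fix the frobnicator' \
                                                                                                      '' \
                                                                                                      'From user Mon Jan  1 00:00:01 2018' \
                                                                                                      'List-Id: project users <users.project.example.org>' \
                                                                                                      'Subject: how do I install this?' > inbox.mbox

                                                                                                    # Count only the messages from the -dev list:
                                                                                                    grep -c '^List-Id:.*<dev\.' inbox.mbox
                                                                                                    ```

                                                                                                    The catch, as the comment says, is that this only works reliably when the list’s headers are consistent; a real setup would put the same pattern in a Sieve or client-side filter rule.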

                                                                                                There’s a whole lot of other things that these tools provide over plain patches over email. Like I said before, most - if not all - of that can be built on top of e-mail, but to achieve the same level of convenience, you will end up with an API that isn’t e-mail. And then you have Yet Another Platform.

                                                                                                1. 2

                                                                                                  How is it more complicated? Clone, work, fork, push, open PR (or clone+fork, work, push+PR)

                                                                                                      Because the work for the send-email approach is: clone, work, git send-email. This is fewer steps and is therefore less complicated. Not to mention that as projects decentralize away from GitHub, the registration process doesn’t go away; it starts recurring for every new forge or instance of a forge you work with.

                                                                                                  But once you build something on top of e-mail, you no longer have an e-mail workflow, you have a different platform with which you can interact via e-mail. Think issues, labels for them, reviews (with approvals, rejection, etc - all of which must be discoverable by programs reliably), new commits, rebases and whatnot…

                                                                                                  Yes, that’s what I’m advocating for.

                                                                                                  But then you built a platform which requires special tooling to use to its full potential, and you’re not much better than GitHub

                                                                                                  No, I’m proposing all of this can be done with a very similar UX on the web and be driven by email underneath.

                                                                                                  PRs can have labels to help in figuring out what part of the repo they touch, or what state they are in. They can have CI states attached.

                                                                                                      So let’s add that to mailing list software. I explicitly acknowledge the shortcomings of mail today and posit that we should invest in these areas rather than rebuilding from scratch without an email-based foundation. But none of the problems you bring up are problems that can’t be solved with email. They’re just problems which haven’t been solved with email. Problems I am solving with email. Read my article!

                                                                                                  but then you are not using e-mail only, and will need special tooling to process the information further (eg, to limit open PRs to those that need a review, for example).

                                                                                                  So what? Why is this even a little bit of a problem? What the hell?

                                                                                                      With GitHub and the like, you can subscribe to issues and pull requests, and you’ll get notifications about those and those alone.

                                                                                                      You can’t subscribe to just issues or just pull requests; you have to subscribe to both, plus new releases. Mailing lists are more flexible in this respect: there are often separate thing-announce, thing-discuss (or thing-users), and thing-dev mailing lists which you can subscribe to separately, depending on what you want to hear about.

                                                                                                  Like I said before, most - if not all - of that can be built on top of e-mail, but to achieve the same level of convenience, you will end up with an API that isn’t e-mail.

                                                                                                  No, you won’t. That’s simply not how this works.

                                                                                                  Look, we’re just not on the same wavelength here. I’m not going to continue diving into this ditch of meaningless argument. You keep using whatever you’re comfortable with.

                                                                                                2. 2

                                                                                                  Your magithub workflow looks much more complicated, and you could use keybindings to plug into send-email as well.

                                                                                                      I just remembered a good illustration that might explain my stance a bit better. My wife, a garden engineer, was able to contribute to a few projects during Hacktoberfest (three years in a row now), with only a browser and GitHub for Windows at hand. She couldn’t have done it via e-mail, because the only way she can use her email is via her smartphone, or Gmail’s web interface. She knows nothing else, and is not interested in learning anything else either, because these perfectly suit her needs. Yet, she was able to discover projects (by looking at what I contributed to, or have starred), search for TODOs or look at existing issues, fork a repo, write some documentation, and submit a PR. She could have done it all from a web browser, but I set up GitHub for Windows for her - in hindsight, I should have let her just use the browser. We’ll do that this year.

                                                                                                  She doesn’t know how to use the command-line, has no desire, and no need to learn it. Her email handling is… something that makes me want to scream (no filters, no labels, no folders - one big, unorganized inbox), but it suits her, and as such, she has no desire to change it in any way.

                                                                                                  She doesn’t know Emacs, or any IDE for that matter, and has no real need for them, either.

                                                                                                  Yet, her contributions were well received, they were useful, and some are still in place today, unchanged. Why? Because GitHub made it easy for newcomers to contribute. They made it so that contributing does not require them to use anything else but GitHub. This is a pretty strong selling point for many people, that using GitHub (and similar solutions) does not affect any other tool or service they use. It’s distinct, and separate.

                                                                                                  1. 2

                                                                                                    Not all projects have work for unskilled contributors. Why should we cater to them (who on the whole do <1% of the work) at the expense of the skilled contributors? Particularly the most senior contributors, who in practice do 90% of the work. We don’t build houses with toy hammers so that your grandma can contribute.

                                                                                                        I’m not saying we shouldn’t make tools which accommodate everyone. I’m saying we should make tools that accommodate skilled engineers and build simpler tools on top of that. Thus, the skilled engineers are not slowed down and the greener contributors can still get work done. Then, there’s a path for newer users to become more exposed to more powerful tools and more smoothly become senior contributors themselves.

                                                                                                    You need to get this point down if you want me to keep entertaining a discussion with you: you can build the same easy-to-use UX and drive it with email.

                                                                                                    1. 3

                                                                                                      I’m not saying we shouldn’t make tools which accommodate everyone. I’m saying we should make tools that accommodate skilled engineers and build simpler tools on top of that.

                                                                                                      I was under the impression that git + GitHub are exactly these. Git and git send-email for those who prefer that style, GitHub for those who prefer the other. The skilled engineers can use the powerful tools they have, while those with a different skillset can use GitHub. All you need is willingness to work with both.

                                                                                                      you can build the same easy-to-use UX and drive it with email.

                                                                                                      I’m not questioning you can build something very similar, but as long as e-mail is the only driving power behind it, there will be plenty of people who will turn to some other tool. Because filtering email is something you and I can easily do, but many can’t, or aren’t willing to. Not when there are alternatives that don’t require them to do extra work.

                                                                                                      Mind you, I consider myself a skilled engineer, and I mainly use the GitHub/GitLab APIs, because I don’t have to filter e-mail or parse the info in it; the API serves data in a form I can use more easily. From an integrator’s point of view, this is golden. If, say, an Emacs integration starts with “Set up your email so that mail with these properties is routed here”, that’s not a good user experience. And no, I don’t want to use my MUA to work with git, because magit is a much better, much more powerful tool for that, and I value my productivity.

                                                                                                      1. 1

                                                                                                        I’m not questioning you can build something very similar, but as long as e-mail is the only driving power behind it, there will be plenty of people who will turn to some other tool.

                                                                                                        I’m pretty sure the whole point would be that the “shiny UI” tool would not expose email to the user at all – so the “plenty of people” wouldn’t leave because they wouldn’t know the difference.

                                                                                                        1. 0

                                                                                                          So… pretty much GitHub/GitLab/Gitea 2.0, but with the added ability to open PRs by email (to cater to that workflow), and a much less reliable foundation?

                                                                                                          Sure. What could possibly go wrong.

                                                                                          2. 1

                                                                                            I don’t think you can count signing up for GitHub if you’re not counting signing up for email.

                                                                                             If you’re using hub, it’s just hub pull-request. No context switching.

                                                                                            1. 2

                                                                                              If you’re counting signing up for email you have to count that for GitHub, too, since they require an email address to sign up with.

                                                                                          3. 1

                                                                                             Using GitHub requires pushing to a different repository and then opening the PR in the GitHub interface, which is a context change.

                                                                                             git send-email is only one step, akin to opening the PR, with no need to push to a remote repository. And it works from the comfort of your development environment (Emacs, in my case).

                                                                                      1. 2

                                                                                        Haskell’s QuickCheck is my test framework of choice. There’s a really nice library called LazySmallCheck2012 which I’ve used a few times, but it seems mostly unmaintained (I patched it to work with GHC 7.10, but the pull request has sat unmerged for years). Maybe some idea in that direction would be useful?

                                                                                        Let’s say we need to generate some data of type (Bool, Either String Colour). With QuickCheck and SmallCheck we would choose a value (at random or systematically, respectively), like (True, Right Blue), and pass this to a test.

                                                                                        The idea of LSC and LSC2012 is that we only make choices when we have to. We do this by effectively running our test on the value undefined. If the test passes, we know it will work for all possible inputs. If it fails, we’ve found a (minimal) counterexample. If it throws an exception, it must have tried branching on the value we gave it; in which case we re-run the test on all those values which make one more choice than last time, e.g. (undefined, undefined). If we get another exception, say from the second element, we would re-run with both (undefined, Left undefined) and (undefined, Right undefined). We keep doing this, adding specificity if there’s an exception, until either the test passes, we find a counterexample, or we reach some depth limit.

                                                                                        The effect seems to be rather like using logic programming.
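                                                                                          The demand-driven core of this can be sketched in plain Haskell: run the test on undefined and catch the resulting exception to learn whether the test branched on its input. This is only an illustration of the idea, not LSC2012’s actual implementation, and demands is a made-up helper name.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Does the predicate p demand (force) its argument? We run it on the
-- given value and treat an exception as "yes, it branched on the input".
demands :: (a -> Bool) -> a -> IO Bool
demands p x = do
  r <- try (evaluate (p x)) :: IO (Either SomeException Bool)
  return (either (const True) (const False) r)

main :: IO ()
main = do
  -- A test that ignores its input passes for all inputs at once:
  print =<< demands (const True) (undefined :: (Bool, Int))  -- False
  -- A test that inspects the first component needs refinement, so we
  -- would next try (True, undefined) and (False, undefined):
  print =<< demands fst (undefined :: (Bool, Int))           -- True
```

                                                                                          A real implementation distinguishes its own deliberate “unexplored value” exception from genuine test failures, and recurses with progressively more-defined candidates until the test passes, fails, or hits the depth limit.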

                                                                                        1. 5

                                                                                          What would you like to have improved in generative testing in general?

                                                                                          • Smart ways to generate recursive data structures with low risk of them blowing up exponentially as the size parameter increases

                                                                                          • Smart ways to direct generator distributions to problematic inputs.

                                                                                          1. 3

                                                                                            Smart ways to generate recursive data structures with low risk of them blowing up exponentially as the size parameter increases

                                                                                              Most automated implementations suffer from this (e.g. generic deriving libraries in Haskell). The problem comes down to the expected number of recursive calls: for example, if a binary tree generator has a 50/50 chance of picking a leaf or a node, where the leaf makes no recursive calls and the node makes two, we can expect 0.5 * 0 + 0.5 * 2 = 1 recursive call per step, and hence the expected size of our data is unbounded. When a constructor can have many sub-expressions, like a list, this number can grow much larger.

                                                                                            The naive way to tackle this is to adjust the probabilities, such that leaf is chosen more often and the expected recursive calls are < 1. Unfortunately this causes exponential decay in the amount of data generated; so we may never see values more than a few levels deep.

                                                                                            I tend to avoid this by passing a “fuel” parameter through the generator. This is conserved, so if we want to generate multiple pieces of data (e.g. elements in a list) we must divide it up. The original QuickCheck paper mentions this, but says it’s undesirable since it couples together different parts of the generated data (if some values are large, the others will be small).
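                                                                                              A minimal sketch of the fuel idea (base-only Haskell so it stays self-contained; Tree, fuelTree, and the toy LCG are made-up names, and real code would thread QuickCheck’s Gen with oneof and choose instead):

```haskell
-- Toy linear-congruential RNG so the sketch needs only base;
-- a real generator would use QuickCheck's Gen monad instead.
type Seed = Int

next :: Seed -> (Int, Seed)
next s = let s' = (s * 1103515245 + 12345) `mod` 2147483648 in (s', s')

-- Uniform-ish choice in [0 .. n].
range :: Int -> Seed -> (Int, Seed)
range n s = let (x, s') = next s in (x `mod` (n + 1), s')

data Tree = Leaf | Node Tree Tree

size :: Tree -> Int
size Leaf       = 1
size (Node l r) = 1 + size l + size r

-- Fuel-passing generator: the budget n is conserved, divided between
-- the two subtrees, so the result never exceeds 2*n + 1 nodes.
fuelTree :: Int -> Seed -> (Tree, Seed)
fuelTree 0 s = (Leaf, s)
fuelTree n s =
  let (coin, s1) = range 1 s
  in if coin == 0
       then (Leaf, s1)
       else let (k, s2) = range (n - 1) s1        -- divide the fuel
                (l, s3) = fuelTree k s2
                (r, s4) = fuelTree (n - 1 - k) s3
            in (Node l r, s4)

main :: IO ()
main = print (size (fst (fuelTree 20 42)))  -- always <= 41
```

                                                                                              The naive 50/50 version would just call itself with no budget. The coupling the QuickCheck paper complains about is visible in the fuel split: a large left subtree leaves less fuel for the right one.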

                                                                                            There are some smarter approaches too although I’ve not used them.