1. 5

      You might enjoy this later post, which is more explicitly about the problems of DRY and abstractions:

      https://programmingisterrible.com/post/176657481103/repeat-yourself-do-more-than-one-thing-and

      1. 2

        Indeed, I’d forgotten about that one.

    1. 13

      Sometimes I read something like this and think ‘well yeah, obviously nobody is actually saying to take any advice they give to the most extreme possible point, use your judgement’. But then I remember all the code I’ve read (and this seems most common in Ruby for some reason) where people have literally factored out every single function until they’re almost all exactly 1 line long. And the code where they have written functions with four boolean arguments, used in half a dozen places with two combinations of boolean parameters. And the code that’s been hacked and hacked and hacked and hacked together to form a 5000-line shell script when they could have achieved the same result with a few hours and 200 lines of Python or something.

      The traditional UNIX command line is a showcase of small components that do exactly one function, and it can be a challenge to discover which one you need and in which way to hold it to get the job done. Piping things into awk ‘{print $2}’ is almost a rite of passage.

      I find this an interesting example if only because I think the Unix command line is a good example of how to do it right: even if you don’t remember the command to use, you can always just emulate most of the other commands with awk. And the general style leads to some really lovely software like gvpr, which I discovered yesterday.
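
      A couple of throwaway sketches of what I mean (file names made up):

          awk '/error/' app.log          # grep, more or less
          awk 'NR <= 10' app.log         # head, more or less
          ps aux | awk '{print $2}'      # the rite of passage itself: print the second column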

      1. 8

        Sometimes I read something like this and think ‘well yeah, obviously nobody is actually saying to take any advice they give to the most extreme possible point, use your judgement’.

        In other words, this means you’re not the audience: this is really aimed at those still building the intuitions.

        As you explain, the problem is that we don’t often show good judgement. It’s only after knowing the consequences that we tend to take action. Beginners have often asked me how and when and where to apply things. The problem is that it’s contextual, and I was hoping to give that context.

        Rather than examining things through re-use, I wanted them to think about coupling: seeing modules as a way of keeping features apart, rather than as collections of like features, and the whole ‘rewrites mean migrations’ thing too.

        I find this an interesting example if only because I think the Unix command line is a good example of how to do it right

        Yes, and no. I mean, I thought the UNIX philosophy was a good idea until I realised how thoroughly git demonstrates it: flat files, small commands bolted together, fast C parts tied together with bash. It even has the unix thing where each file format or pipe output ends up being a unique mini-language inside the program, too. It’s still awful to use.

        It’s a good way to build an environment but, well, every command takes slightly different arguments, and things like autocomplete don’t come from inspection or understanding the protocol, and we’re still emulating VT100 terminals. There are good ideas, but UNIX demonstrates their discovery more than their application.

        On the other hand, plan9 demonstrates them quite well, and some of the problems too. It’s still not exactly pleasant to use, although wonderfully extensible. Plan9 leverages a consistent interface in more ways than UNIX did, exposing every service as a filesystem.
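
        You can get a taste of that idea on an ordinary Linux box, since /proc borrows it:

            ls /proc/self/fd              # a process’s open file descriptors, as a directory
            head -3 /proc/self/status     # process state exposed as a plain text file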

        The notion of a uniform interface is also seen in HTTP, and for what it’s worth, how clients on plan9 move from one file to another is very reminiscent of following hypertext in a browser. There are good ideas in UNIX, but there are better examples of them.

        Awk isn’t one of them. I mean, awk’s great, but it’s one of the things, like tcl, bash, and perl, that marked the end of ‘do one thing and do it well’: they were glue languages that grew features. Even bash 4 has associative arrays now.

        UNIX has grep and egrep and ripgrep and at least three distinct types of regular expressions in common use. UNIX has a thousand different command line formats and application directory layouts. UNIX gave us autoconf.

        I mean UNIX is great and all but we kept hacking shit on

        1. 6

          In other words, this means you’re not the audience: this is really aimed at those still building the intuitions.

          What I meant is that my first reaction was ‘pointless article’, but that reaction is wrong! I think the article is good and necessary, and more like it are needed.

          Yes, and no. I mean, I thought the UNIX philosophy was a good idea until I realised how thoroughly git demonstrates it: flat files, small commands bolted together, fast C parts tied together with bash. It even has the unix thing where each file format or pipe output ends up being a unique mini-language inside the program, too. It’s still awful to use.

          What? Git is not awful to use, it’s fantastic for all those reasons you just gave. You can dig into the internals of it without having to read any C. You pipe together those files into different formats yourself using a combination of standard utilities and git-x-y-z plumbing commands. What’s awful about that?
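
          For instance, a rough sketch of poking at a repository with nothing but plumbing (assuming a repo that has a README.md):

              git rev-parse HEAD               # resolve a ref to its commit hash
              git cat-file -p HEAD             # pretty-print the raw commit object
              git ls-tree HEAD                 # list the tree that commit points at
              git cat-file -p HEAD:README.md   # dump a blob by path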

          I have a much harder time ever getting anything to work in Mercurial, to be honest. Every time I try to use Mercurial it’s just the same as git except some of the commands have slightly more sensible names, everything is incredibly sluggish and lots of features just don’t exist or only exist if you turn on a million extensions.

          And then once you have those extensions enabled, it’s just as confusing and inconsistent as git. Go look at the… is it called queues? Something like that, I’ve forgotten. It’s necessary to get a lot of what comes in git by default, and it’s way overcomplicated.

          It’s a good way to build an environment but, well, every command takes slightly different arguments, and things like autocomplete don’t come from inspection or understanding the protocol, and we’re still emulating VT100 terminals. There are good ideas, but UNIX demonstrates their discovery more than their application.

          Of course different commands take different arguments; they do different things and have different purposes. Why would they all be the same? There’s nothing stopping you going and writing a patch for scp that lets it take -R to mean -r, something I always mistype the first time, being used to other commands. I doubt they’d reject the patch.

          Everything accepts --help and man pages exist.

          The state of terminals is a rather different question. It’s just one of those things where we’re stuck in a bit of a local maximum. Trying to move to something that isn’t VT100 terminal emulation would require an enormous amount of effort for a relatively small benefit. Emulating VT100 terminals doesn’t really hurt except for a few little things like ctrl-i and tab being the same thing, and in some scenarios that’s even what you want: some people want to be able to tab-complete with ctrl-i. But it really has nothing to do with the Unix philosophy anyway.

          Autocomplete, well, you could define a format for --usage that is machine-parseable and defines the format for commands. Whenever you do x -o [tab] it calls MACHINE_READABLE_USAGE_OUTPUT=1 x --usage and then parses that result to see that -o is followed by a file, etc. etc. etc. Any other protocol you like. Maybe man pages could have an additional USAGE section with a machine-readable grammar for their usage. Getting shells to all agree on one particular way of doing things is the issue, not the ability to do something like that within the Unix command line model.
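
          A sketch of how that could look in bash. Everything protocol-side is imagined: `x`, its --usage output, and MACHINE_READABLE_USAGE_OUTPUT are the hypotheticals from above; only the completion machinery around them is real:

              # completion function for the imaginary command `x`
              _x_complete() {
                  local cur=${COMP_WORDS[COMP_CWORD]}
                  local flags
                  # ask the command to describe its own flags, one per line (made-up protocol)
                  flags=$(MACHINE_READABLE_USAGE_OUTPUT=1 x --usage 2>/dev/null | awk '{print $1}')
                  COMPREPLY=( $(compgen -W "$flags" -- "$cur") )
              }
              complete -F _x_complete x   # wire the function into bash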

          The whole idea of commands in a command line is arguably what it means to have ‘the unix command line’, given that they can be piped together and that they input and output text.

          On the other hand, plan9 demonstrates them quite well, and some of the problems too. It’s still not exactly pleasant to use, although wonderfully extensible. Plan9 leverages a consistent interface in more ways than UNIX did, exposing every service as a filesystem.

          I really don’t think that ‘everything is a file and every service is a filesystem’ is the right way to view the Unix philosophy. Plan9 doesn’t feel like the ultimate culmination of Unix to me. It feels like… I don’t want to be rude about it, I don’t mean this in a rude way, but it feels like a caricature of the Unix philosophy.

          The Unix philosophy is implementing things in a standardised and accessible way so that you can use a general suite of tools to handle different things. It doesn’t have to be text, it’s just that it should be text if it can reasonably be text. ffmpeg still feels like a Unix command to me.

          The thing that feels least-Unixy to me is audio on my system. Audio should definitely be done differently from how it is. I feel like I have almost no control over it. I want to be able to say ‘take the audio from here and put it into here then merge those audio streams and copy this one to this output then with the new copied output mix the channels to mono’ etc. etc. And not using some arcane GUI.
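
          To be fair, the file-based half of that wish is already scriptable; a sketch with ffmpeg (file names made up), though live routing between devices is the part that still ends in an arcane GUI:

              # merge two audio streams, then mix the result down to mono
              ffmpeg -i a.wav -i b.wav -filter_complex amix=inputs=2 -ac 1 merged-mono.wav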

          Awk isn’t one of them. I mean, awk’s great, but it’s one of the things, like tcl, bash, and perl, that marked the end of ‘do one thing and do it well’: they were glue languages that grew features. Even bash 4 has associative arrays now.

          There’s a rule I have that in any system there will always be something complicated. It’s kind of broad, but look at any categorisation, any set of rules, any set of tools: there will always be a ‘misc’. It might be quite hidden, or it might simply be labelled ‘miscellaneous’. In any set of tools there’s always one you reach for in all those random little situations the others don’t fit. In any categorisation of anything, there’ll always be a few objects that just don’t fit into your neat hierarchy and need to be put into ‘other’.

          The Unix command line is no different. You have all the little useful tools, and then you have awk, because sometimes you just have to do something complicated. That’s the reality, right? Sometimes you have to do something complicated.

          UNIX has grep and egrep and ripgrep and at least three distinct types of regular expressions in common use. UNIX has a thousand different command line formats and application directory layouts.

          ‘There should be one — and preferably only one — obvious way to do it’ is the Python motto, not the Unix philosophy.

          Unix has grep and egrep and ripgrep, sure. grep is the traditional Unix tool, egrep is an alias for grep -E using extended regular expressions. I assume these are even less actually-regular than grep’s regular regular expressions and thus slower. ripgrep is a modern reimplementation of grep in Rust that (as far as I know) only supports true regular expressions and is very fast as a result.
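
          A quick illustration of the flavour differences (GNU grep assumed; \? in a BRE is a GNU extension):

              grep    'colou\?r' words.txt   # BRE: ? must be escaped to mean ‘optional’
              grep -E 'colou?r'  words.txt   # ERE, i.e. what egrep gives you: ? is a metacharacter
              rg      'colou?r'  words.txt   # ripgrep: Rust regex engine, ERE-like syntax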

          A better comparison would be between Perl-style and POSIX-style regular expressions, but those really are completely different things. You might even get away with arguing that one is imperative and the other declarative. They both have good reasons to exist, there are definitely reasons to prefer either, and they coexist; I think that’s a good thing.

          There are many different command line formats? Not sure what that really means. Virtually everything today uses - before short options, allows short options to be combined like -xcvf instead of -x -c -v -f, and supports --long-arguments. Yeah, there are a few older commands like ps that support several formats in one command, but that’s just backwards compatibility. The only systems that don’t have a few ugly corners kept for backwards compatibility are new ones that nobody has used enough yet. The only way to avoid them is to throw out everything more than a year or two old. Please don’t turn Unix into front end web development.

          Application directory layouts? No idea what that means, sorry.

          UNIX gave us autoconf.

          autoconf is to many other build systems as GPL is to BSD licenses. Is it a pain for developers? Yeah, absolutely. But it’s not designed to be easy for developers. It’s designed so that you can give a tarball to a user and they just type ./configure [possibly some arguments]; make; make install. Just as GPL is designed to be friendly for end users while BSD is designed to be friendly for developers, autoconf is designed to be essentially invisible to end users. I don’t have to install cmake and deal with CMakeLists.txt and other annoying crap when I just want to run ./configure && make && sudo make install.

          And remember autoconf was not designed for people compiling a simple bit of C software on one or two Linux distributions, as it’s mostly used today, but to work around the inconsistencies and incompatibilities of dozens of different Unix operating systems. Today it’s unnecessary more than it’s bad. You really only need about a 15-line Makefile for all but the most complex C programmes, one you write once and never touch again. In those 15 lines you can quite easily and readably scan the included headers of each file to generate the dependencies between compilation units, and handle all that stuff very easily.
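
          The header-scanning bit really is about one line if you lean on the compiler; a sketch assuming GCC or Clang:

              # emit make-style dependency rules by scanning each file's #includes,
              # then `-include deps.mk` from the Makefile
              cc -MM *.c > deps.mk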

          Most of the problems people have with autoconf come from copy-pasting existing configurations and blindly hacking at them with absolutely no understanding of what is actually going on whatsoever. There are configuration switches in some programmes that haven’t been relevant since before the person who wrote them was born.

          1. 8

            There is so, so much to unpack here, but frankly it seems like a pointless conversation.

            What? Git is not awful to use,

            The command line tool whose manual pages get Markov-chained as satire. The command line tool where the primary interface is Stack Overflow.

            I’m really not sure we’ve used the same tool.

            Plan9 doesn’t feel like the ultimate culmination of Unix to me.

            Tell that to the UNIX authors, who wrote it.

            egrep is an alias for grep -E using extended regular expressions

            Other way around, buddy: egrep came first, and GNU grep built it in later. Unix, the system with a command line program called [

            Also, I think you’re confusing the GNU userland with UNIX; GNU went against most of the UNIX design ideas of the time.

      1. 4

        The advice in here is not bad. But, IMO, a mistake I often see fellow developers make is thinking that everything they write has to take this entire approach. So rather than seeing their ability to write code grow, to the point where they can write a meaningful library the first time, they reset to “copy and paste” with each new piece of code. That makes it really hard for them to reach escape velocity and write good code. The outline in here should be for an entire career. At this point, unless it’s something wildly new, I can write a good, reusable library off the top of my head the first time around.

        1. 5

          Hi

          thinking that everything they write has to take this entire approach.

          I tried to capture this dissonance. Each step contradicts the next. I tried to repeat the point “you won’t know what you need until you already need it” throughout the steps.

          The outline in here should be for an entire career.

          That was also my intent. Along with each step contradicting the next, each step was further along the lifecycle of a project. I admit I didn’t try to make this too clear.

          1. 2

            I must admit, to really reach escape velocity you have to take your time and think through your data structures. If your data structures suck, the code will come to a standstill sooner or later (or explode in bloat). In those cases it can even make sense to rewrite the program (copy-pasting over the useful utility functions), because it lets you delete the code that was only introduced to work around weaknesses in the old data structures. I often find these hacks to be the cause of bugs in the first place as well.

            1. 1

              Can you elaborate on what you mean exactly? The way I approach a problem is to write a clear, pure (not in the functional sense) API that solves the problem. It hides all the underlying data structures. For most things in a program you can use the most inefficient data structure ever and it doesn’t really matter, so when it comes to the really hot paths I can generally refactor the inside of the API, without affecting the API itself, to be more efficient. In the worst cases I have to tear down a few walls for efficiency. In other words, I find the actual choice of data structure mostly a non-question until some performance issue demonstrates that it needs to be one.

              I’ve found the above very effective in every project I’ve worked on. Are you suggesting a different approach or does that align with what you’re saying as well?

              1. 1

                I’m mostly a C developer, so for me it all boils down to giving the user of a library easy, simple structs to work with, interfaced via functions. The API is thus also part of the data structure, and both approaches (hiding or exposing the data structure) are good in my view. Do what suits you best and, most importantly, what suits the problem.

            2. 1

              At this point, unless it’s something wildly new, I can write a good, reusable library off the top of my head the first time around.

              But should you do so for that script to load the barfoo data into the baz database one time before we build an interface to manually edit the data?

              I would say that cut and paste would work just fine for that use case and that a good reusable library is overkill.

              1. 2

                Cut and paste what? With the reusable components I have, I can generally just combine some existing components pretty trivially to get the new behaviour. The case I think you’re referring to is much rarer, IME, than many people like to claim. Not never, but rare enough that it’s not a reasonable default, IMO.

            1. 1

              (This came about from having to deal with the ‘never rewrite’ schtick Spolsky used, which was unfortunately interpreted as dogma.)

              1. 1

                Yeah, my favorite part of this article is the examination of how an anecdote and a hundred thousand pageviews turns into gospel truth.

              1. 1

                As a postscript, one thing I forgot to mention was that the Sutherland in the paper was Ivan Sutherland

                http://en.wikipedia.org/wiki/Ivan_Sutherland

                1. 3

                  Day job: writing a distributed web crawler. We’re getting to the point where reading the Dynamo paper is actually going to be useful. I worked for a while on https://github.com/hyperglyph/hyperglyph which is an RPC library that uses hypermedia.

                  Recently I was trying to hack up a loop in YouTube videos: http://secretvolcanobase.org/~tef/loop.html?videoid=jKTEXhc6IyY&start=8&stop=11.5 It’s harder than it looks, and it doesn’t work well enough yet.

                  1. 2

                    Distributed web crawler…what kind of stuff are you crawling? General search or something specific?

                    1. 1

                      Depends. Some people want their content captured, other people want a wider scope. We can do filtering or direction but it requires more manual intervention than I’d like.

                    1. 2

                      Neat! SuperCollider is awesome fun.

                      I recently used SuperCollider + DarwiinRemoteOSC to hook up Wiimotes and the Wii Fit board to make terrible wobbly sounds and control sound loops with my friend.

                      SuperCollider is a very quirky Smalltalk, and I’ll admit I wouldn’t have gotten anywhere without help. My friend has a lot more experience with its quirks and catches, and between us we managed to get the peripherals working.

                      He later went on to make a Wii theremin that could play the Doctor Who theme tune.

                    1. 1

                      Ruby has quite an ornate grammar, so the first thing I tried broke in opal. I opened an issue.

                      https://github.com/opal/opal/issues/137

                      Looking through the issue tracker there are a number of other ruby features which are unimplemented/implemented poorly. Like blocks: https://github.com/opal/opal/issues/130

                      It’s a nice idea, but they have a long slog ahead of them to implement ruby.

                      1. 1

                        If you try to reimplement the Ruby grammar yourself, you’re gonna have a bad time.

                        This is still a neat project, even though it will probably only be a toy.

                        1. 2

                          There are two known good ways to re-implement the ruby grammar:

                          • cargo-cult what’s in parse.y, EXPR_MID and all. (There seem to be a bunch of unused lexer states that appear in every attempt.)

                          • use a generalized parser and disambiguate the parse tree later.

                          The ruby grammar is somewhat ambiguous, so either you have to disambiguate in the lexer, or the parser. The lexer route is the parse.y technique – using the symbol table, and having the parser change the lexer state to force it to return specific tokens (DO_COND/DO_LAMBDA, etc). The parser route is cleaner, but comes with the cost of GLR parsing.
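
                          For anyone who hasn’t hit it, the canonical symbol-table case (MRI behaviour; the first one also prints an ‘ambiguous first argument’ warning):

                              ruby -e 'def x(a) a end; p x /2/'   # x is a method => p(x(/2/)),   prints /2/
                              ruby -e 'x = 8; p x /2/1'           # x is a local  => p(8 / 2 / 1), prints 4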

                          1. 1

                            Well, if it ever gets beyond the toy stage, it would be fun to build a Ruby VM by cross compiling it to JS and evaluating it with V8. It would provide a third Ruby VM that actually compiles to machine code (Rubinius and JRuby being the only others I know of).

                            1. 1

                              I know there’s at least one MRI bytecode VM written in JS, but I can’t think of the link right now.

                              1. 1

                                coldruby looks like what you’re thinking of. From the readme it seems to run YARV bytecode in a JavaScript runtime. Although I don’t see it mentioned, there’s an older project, hotruby, that has to be the inspiration for coldruby. It might be fun to flesh out @tenderlove’s Scheme-to-YARV-bytecode compiler to mix ruby, scheme, and js on coldruby (although that’s missing the point/coolest part of his gist (omg, RubyVM::InstructionSequence.load with fiddle!)).
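
                                If you want to peek at the YARV bytecode such a VM consumes, MRI (1.9+) will show you:

                                    # dump the instruction sequence for a snippet of ruby
                                    ruby -e 'puts RubyVM::InstructionSequence.compile("1 + 1").disasm'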

                        1. 5

                          I have been on the other end of it, but more “Jerk” and definitely not “Brilliant”. As much as I find some of the article objectionable, I agree with the advice.

                          To be just as asinine as the article:

                          What should you do with people you treated like founders but now want to demote as you hire more employees? Fire them. It’s quicker than doing it piecemeal.

                          It’s a bit unfair to label people who invested time in the early days of the company as jerks because they don’t agree with the direction the company is going in. It often isn’t loyalty from the management side that compels entrepreneurs to retain these brilliant jerks; it’s the fact that they’ve been working hard for a comparatively low salary.

                          I also don’t think it’s fair to characterise the motives of said jerk as ‘maintaining their glory’, when the desire to grow the company could often just as easily be labelled greed. It’s simply name-calling in lieu of empathy.

                          Underlying the article is the denial of a demotion – it isn’t the change in size, it’s the change in hierarchy that makes the difference. You begin on the same level as the founders, but as the company grows, they rise, and you sink to the bottom.

                          1. 1

                            There’s some discussion in the comments about a content-addressed URL scheme: href="sha256://abcdef1234"

                            This would combine neatly with the ability to have more than one href property: you could specify the hash, and the additional references would let you specify a mirror or backup server.

                            <script src="sha256://xyz" src="site1.example.com/foo.js" src="backup.example.com/foo.js">
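
                            A sketch of what a client would do with such markup, reusing the placeholder fingerprint from above (a real sha256 value is 64 hex digits):

                                # fetch from whichever mirror answers, then check the bytes against the fingerprint
                                curl -sL https://site1.example.com/foo.js -o foo.js
                                sha256sum foo.js   # compare the digest to the one in the sha256:// URL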

                            1. 2

                              Feels like you’re reinventing magnet links, though – you don’t need a new URL type for content-addressed stuff.

                              As you’ve touched on, HTTP is essentially a location-centric protocol rather than a content-centric one. Moving to fingerprints over locations suggests that HTTP isn’t the right way to go about it.

                              Your suggestion is conceptually simple – it’s an easy change in the format, but possibly a hard change in the tooling (repeating src="…" attributes would necessitate changes to the DOM API too).

                              1. 5

                                This talk is fantastic: http://vimeo.com/20781278

                                This model is pretty good too: http://martinfowler.com/articles/richardsonMaturityModel.html

                                1. 3

                                  :+1: to both of these, for sure.

                              1. 3

                                Something for esoteric or non-language-specific things would be nice – for weird languages and language design issues.

                                Perhaps allowing software.* style tags might help.

                                1. 2

                                  There is a “compsci” tag that should encompass language design. For articles about specific languages that don’t have tags, maybe flaviusb’s “otherprog” or whatever it becomes will work for those.

                                  1. 1

                                    I think areas of Computer Science like programming languages (especially language design) are broad enough to be considered outside the scope of software engineering, so I would support an entirely separate “programming languages” tag. But I think the software.* style tags would be nice too, like software.agile or software.tdd. Then, if you followed software, you would also follow everything under software.*.