1. 5

    I’m pretty sure the first one is wrong. I thought the spec said explicitly that evaluating sizeof may not have side effects. This construct seems to be accepted by GCC and Clang but I think it’s actually UB. In C++ decltype may not have side effects and I believe this follows from the same rule.
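
    For reference, here’s a minimal sketch of the construct being debated (next() is a hypothetical helper; with a 4-byte int this prints “16 4” under GCC and Clang, so the side effect does happen):

        #include <stdio.h>

        static int n = 3;
        static int next(void) { return ++n; }  /* side effect on n */

        int main(void) {
            /* int[next()] is a VLA type, so the size expression is
               evaluated at runtime and the side effect occurs */
            size_t s = sizeof(int[next()]);
            printf("%zu %d\n", s, n);
            return 0;
        }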

    VLA typedefs are useful for the same reason that VLAs are useful. If you are using a VLA, it’s useful to be able to name the type.
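
    A small sketch of naming a VLA type; as I understand C11, the array size is captured when the typedef declaration is reached:

        void process(int n) {
            typedef double row[n];  /* ‘row’ names a VLA type; its size is
                                       fixed when this declaration executes */
            row a, b;               /* two arrays of the same runtime length */
            (void)a; (void)b;
        }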

    Array designators were a GNU extension but were added to C in C11, I think. They’re really useful and it’s a shame that they’re not in C++.
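
    A quick sketch of what array designators buy you (sparse, order-independent initialisation):

        int squares[10]     = { [0] = 0, [3] = 9, [9] = 81 };
        const char *names[] = { [2] = "two", [5] = "five" };  /* length deduced as 6 */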

    The preprocessor is a functional language and you can pass a macro to another macro, but evaluation may not happen when you think it does. The most common case where this bites people is in stringify macros.
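
    The classic stringify pitfall, as a minimal sketch: # applies to the argument tokens as written, so you need an extra level of indirection to get the argument expanded first.

        #define VALUE   42
        #define STR(x)  #x       /* stringifies the tokens as written */
        #define XSTR(x) STR(x)   /* expands the argument first, then stringifies */

        /* STR(VALUE)  -> "VALUE" */
        /* XSTR(VALUE) -> "42"    */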

    The switch thing comes with the same caveat as normal switch: It’s basically goto and you must be really careful not to branch over variable initialisation. I can’t remember if C rejects constructs that do or just makes them UB. It also doesn’t generate better code with a modern compiler than writing it the sane way - it’s just basic blocks and branches at the IR layer.
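
    A small sketch of the hazard (if I read C11 6.8.4.2 correctly, a jump into the scope of a variably modified type is a constraint violation, while jumping over an ordinary initialisation is accepted and merely leaves the variable indeterminate):

        #include <stdio.h>

        void demo(int which) {
            switch (which) {
                int x = 42;         /* unreachable: the jump to a case
                                       label skips this initialiser */
            case 0:
                printf("%d\n", x);  /* x is indeterminate; reading it
                                       here is undefined behaviour */
                break;
            default:
                break;
            }
        }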

    The a[b] thing is actually even more fun. Addition in C is commutative, so a+b is equivalent to b+a in all contexts. Pointer + integer is defined as address + (size of pointee) * integer, and because addition is commutative it doesn’t matter whether you put the pointer on the left or the right side. The same rule is why integer promotion works in C. If I were trying to design a language feature that looked sensible at first glance but would lead to many bugs, this is probably about as good as anything I could come up with.
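
    Concretely, a small sketch of the commutativity:

        int a[] = { 10, 20, 30 };
        int x = a[1];         /* *(a + 1) == 20 */
        int y = 1[a];         /* *(1 + a) == 20, same thing */
        char c = 2["hello"];  /* 'l' - string literals work too */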

    1. 3

      I thought the spec said explicitly that evaluating sizeof may not have side effects.

      Here’s what it says in section 6.5.3.4 paragraph 2 (about the sizeof and _Alignof operators) in my copy of the C11 spec:

      If the type of the operand is a variable length array type, the operand is evaluated;

      Then in section 6.7.6.2 paragraph 5 (Array declarators):

      If the size is an expression that is not an integer constant expression: if it occurs in a declaration at function prototype scope, it is treated as if it were replaced by *; otherwise, each time it is evaluated it shall have a value greater than zero.

      I can’t find any mention of side-effects in array declarator expressions.

      1.  

        Array designators were a GNU extension but were added to C in C11, I think. They’re really useful and it’s a shame that they’re not in C++.

        They were introduced in C99 at the same time as member designators (https://port70.net/~nsz/c/c99/n1256.html#6.7.8p6). The obsolete GNU syntax for designators is x: 123 for members and [10] "foo" for indices.
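
        Side by side, as a sketch (GCC still accepts the obsolete forms for compatibility):

            struct point { int x, y; };

            struct point p_old = { x: 1, y: 2 };      /* obsolete GNU member designators */
            char *a_old[12]    = { [10] "foo" };      /* obsolete GNU index designator   */

            struct point p_new = { .x = 1, .y = 2 };  /* standard C99 */
            char *a_new[12]    = { [10] = "foo" };    /* standard C99 */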

        I am surprised that they aren’t supported in C++, I wonder why that is. Perhaps they conflict with some other C++ syntax?

        1.  

          As of C++11, initialiser lists are an object type and can be forwarded and so on. The array indexing extension doesn’t play very nicely with this because it’s special construction syntax for a single kind of object. It also doesn’t play so nicely with type deduction.

        2. 1

          I’m pretty sure the first one is wrong.

          I think so too, but mainly as a bit of semantic hair-splitting. Technically, it’s the type that is causing the side effect, not sizeof.

          1.  

            Damn, I was about to ask whether int b[static 12][*][*] is valid C++.

            I realize you can use std::array to express this in C++, but a lot of my work involves public C APIs around C++ implementations, so making those APIs safer and more expressive would be great.

          1. 1

            Under the General Guidelines:

            Don’t use self-modifying code. Processors with cache ram cannot handle self-modifying code at all well.

            Even by the 90s I suspect self-modifying code was frowned upon, but not for caching reasons.

            1. 4

              This is essentially a press release and off-topic. Please wait and post something that is actually a post-mortem.

              1. 6

                It’s from the engineering blog, I guess as a good example of how engineering blogs are just branches of PR departments.

                1. 1

                  It’s Facebook, not Cloudflare. I doubt we’ll ever get a real breakdown of what happened.

              1. 1

                There was a talk at this year’s Linux Plumbers Conference about something related to this: turning implicit function declarations into an error. This causes massive problems in many packages because autoconf relies on them; making them an error doesn’t cause a configuration failure, so you end up building the code with a feature silently disabled, and it’s not obvious that’s what happened.

                Making this an error by default simply won’t happen without massive amounts of work on autoconf and all the code that depends on it. The proposal was to patch the compiler to produce a special log file that is then checked for problems at the end of the build.

                1. 10

                  Code may change and the corresponding comment often remains as it was. Except now the comment is misleading. This makes maintaining the code unnecessarily difficult.

                  If you’re not adjusting comments in your code when you’re changing the code, then you aren’t maintaining the code.

                  1. 5

                    The same kind of programmer who doesn’t update comments also will not rename the function when he changes its behaviour.

                    1. 1

                      That’s a fair point. If the comment is there, then the programmer should keep it accurate or remove it if it’s not necessary.

                    1. 8

                      For those wondering what “practice” means according to Mike Acton, see the linked talk starting around 13:00.

                      Based on what Mike Acton says, no, I don’t practice and I don’t think I ever have. Nor do I think it is necessary. It doesn’t sound like a bad thing, mind you. That said, what he describes as practice is the kind of thing I do when solving problems or just doing my daily work.

                      1. 10

                        Once we move beyond one-liners, a natural question is why. As in ‘Why not use Python? Isn’t it good at this type of thing?’

                        The reasons provided are fine, but for me the main reason is speed. AWK is much, much faster than Python for “line at a time” processing. When you have large files, the difference becomes clear. (perl -p can be a reasonable substitute.)

                        Once you are writing long AWK programs, though, it’s time to consider Python or something else. AWK isn’t very fun once data manipulation gets complicated.

                        1. 4

                          (perl -p can be a reasonable substitute.)

                          +1. In my eyes, it’s Awk and then Perl. Perl turns out to be much better for these purposes than other scripting languages. The difference in startup time between Perl and Python is very significant. If you don’t use (m)any modules, Perl scripts usually start just as quickly as Awk scripts.

                          1. 2

                            I’m sure that’s true for some kinds of scripts, but that doesn’t match my experience/benchmarks here (Python is somewhat faster than AWK for this case of counting unique words). For what programs did you find AWK “much, much faster”? I can imagine very small datasets being faster in AWK because its startup time is 3ms compared to Python’s 20ms.

                            1. 2

                              For what programs did you find AWK “much, much faster”?

                              Any time the input file is big. As in hundreds of MBs big.

                              I used to have to process 2GB+ of CSV on a regular basis and the AWK version was easily 5x faster than the Python version.

                              1. 1

                                Was the Python version streaming, or did it read the whole file in at once?

                                1. 1

                                  Streaming.

                              2. 2

                                Regarding your results: is the 3.55 under awk with or without -b?

                                I get 1.774s (simple) and 1.136s (optimized) for Python. For simple awk, I get 2.552s (without -b) and 1.537s (with -b). For optimized, I get 2.091s and 1.435s respectively. I’m using gawk here; mawk is of course faster.

                                Also, I’ve noticed that awk does poorly when there are a large number of dictionary keys. If you are doing field-based decisions, awk is likely to be much faster. I tried printing the first field of each line (I removed empty lines from your test file, since line.split()[0] gives an error for empty lines). I got 0.583s for Python compared to 0.176s (without -b) and 0.158s (with -b).

                                1. 4

                                  Also, I’ve noticed that awk does poorly when there are a large number of dictionary keys.

                                  Same here. If you are making extensive use of arrays, then AWK may not be the best tool.

                              3. 2

                                Once you are writing long AWK programs, though, it’s time to consider Python or something else. AWK isn’t very fun once data manipulation gets complicated.

                                I dunno, I think it’s pretty fun.

                                I am consistently surprised that there aren’t more tools that support AWK-style “record oriented programming” (since a record need not be a line, if you change the record separator). I found this for Go, but that’s about it. This style of data interpretation comes up pretty often in my experience. I feel like, as great as AWK is, we could do better - for example, what about something like AWK that can read directly from CSV (with proper support for quoting), assigning each row to a record, and perhaps with more natural support for headers?

                                1. 2

                                  You are right. Recently I was mixing AWK and Python in such a way that AWK produced key,value output that was easily read and processed later by a Python script. Nice, simple, and quick to develop.

                                1. 7

                                  I wonder whether we should give up nested folders and just move to tagging.

                                  1. 6

                                    I tried this for a while and it suffers the same problem as nested folders: you still have to tag/categorize everything.

                                    1. 6

                                      For things that have no better location, I use a system of weekly folder rotation, which works out pretty well: everything current is there, and you usually don’t need to check the older folders much.

                                      Everything that has a better location (e.g. because it’s part of a project) gets moved there.

                                      1. 1

                                        Yeah, tagging just seems more flexible. Yes, it can be a pain, and there is no notion of one categorization being a sub-category of another, so that part is not easily discoverable. Those are two downsides.

                                        1. 2

                                          I do think tagging is better, by the way. When I tried it, though, I found I was very inconsistent with what tags I was using, so finding that “thing that was like some other thing” was not as great as it was made out to be.

                                      2. 3

                                        A path is just a list of tags, especially if you have a super fast search engine like Everything.

                                        I place my files in nested folders, but I don’t navigate them. I open Everything, type parts of the path, and it’s in the top 3 of “Date accessed” 99% of the time. Takes one second.

                                      1. 10

                                        Working with befuddled students has convinced Garland that the “laundry basket” may be a superior model. She’s begun to see the limitations of directory structure in her personal life; she uses her computer’s search function to find her schedules and documents when she’s lost them in her stack of directories. “I’m like, huh … I don’t even need these subfolders,” she says.

                                        This feels like a case of the pendulum swinging too far the other way. If you’ve ever tried to keep everything carefully organized you’ve probably found it to be a chore. That’s because it is for the vast majority of people. I’m a “files and directory” kind of person and I mostly toss things in a “downloads” folder until I feel the need to sift through it a bit and build up the patience to bother doing so. Search, even basic search, is really useful. Directory hierarchies are useful. Neither is all that superior to the other all the time. (Mind you, a good filename goes a long way. Sometimes just renaming it is good enough.)

                                        The article seems to be deliberately playing the extremes off each other for some shock value. I know various high school teachers who teach in an essentially “paperless” classroom and the students are familiar with, and use, a hierarchical filesystem, even if that’s not what they would call it. Would they know about a “file on the Desktop”? Maybe not, but grasping the concept wouldn’t be totally alien. Folders of documents in OneDrive and their electronic classroom setup are quite similar.

                                        1. 11

                                          I know various high school teachers who teach in an essentially “paperless” classroom and the students are familiar with, and use, a hierarchical filesystem, even if that’s not what they would call it

                                          When I was working on Étoilé, I came across a paper that showed that around 10-20% of people find hierarchies a natural form of organisation. It came out at about the time iTunes was getting awful reviews from geeks and stellar reviews from everyone else. iTunes (prior to iTunes 5, which was the turning point where Apple decided that they hated their users) didn’t use a hierarchy at all for organisation. It used structured metadata, but allowed arbitrary filters, rather than the traditional {genre}/{artist}/{album}/{track} filing that most filesystem-based players used. This was a lot more flexible (what happens if I want to list all ‘60s music? Trivial with iTunes’ filter model, difficult with a hierarchy if decade is not the top layer in the hierarchy) and was popular with most users.

                                          I’ve wondered for a long time about the self-selection that we get from the fact that most programming languages are strongly hierarchical (imposing hierarchy was, after all, the goal of structured programming). This leads to programmers being primarily selected from the set of people who think in terms of hierarchy (the percentage of people who think in terms of hierarchy and the percentage that find it easy to learn to program seem to be sufficiently similar numbers that I wouldn’t be at all surprised if they’re the same set). This, in turn, leads to programmers thinking that hierarchical organisational structures are natural and designing UIs that are difficult for non-programmers to learn.

                                          With Étoilé, we wanted to remove files and folders as UI abstractions and provide documents and tags, with a rich search functionality. That can still be mapped into a filesystem abstraction for the low-level interface but it didn’t need to be the abstraction presented to users.

                                          1. 2

                                            …we wanted to remove files and folders as UI abstractions and provide documents and tags, with a rich search functionality.

                                            For the record, I think this is a superior approach. Ultimately, the problem is that actually tagging or categorizing your own data is a chore, so the results are somewhat lacklustre. (Music and photos are kind of an exception here because they often come from a source that provides the metadata that makes a structured search useful.)

                                        1. 2

                                          […] nothing in software engineering makes sense except in light of human psychology; the more attention researchers and developers pay to that, the faster our profession will make real progress.

                                          GCC and Clang have gotten a lot better at providing useful error messages in the last few years. They seem to have taken this (good) advice to heart.

                                          1. 5

                                            Dragonfly has come a long way since; they’re now trading blows with Linux on the performance front despite the tiny team, particularly when contrasted with Linux’s huge developer base and massive corporate funding.

                                            This is no coincidence; it has to do with SMP leveraged through concurrent lockfree/lockless servers instead of filling the kernel with locks.

                                            1. 3

                                              This comparison, which seems pretty reasonable, makes it look like it’s still lagging behind.

                                              1. 7

                                                What I don’t like about Phoronix benchmark results generally is that they lack depth. It’s all very well to report an MP3 encoding test running for 32 seconds on FreeBSD/DragonflyBSD and only 7 seconds on Ubuntu, but that raises a heck of a question: why is there such a huge difference for a CPU-bound test?

                                                Seems quite possible that the Ubuntu build is using specialised assembly, or something like that, which the *BSD builds don’t activate for some reason (possibly even because there’s an overly restrictive #ifdef in the source code). Without looking into the reason for these results, it’s not really a fair comparison, in my view.

                                                1. 3

                                                  Yes. This is well worth a read.

                                                  Phoronix has no rigour; it’s a popular website. A benchmark is useless if it is not explained and defended. I have no doubt that the benchmarks run in TFA were slower under FreeBSD and DragonflyBSD, but it is impossible to make anything of that if we do not know:

                                                  1. Why

                                                  2. What is the broader significance

                                                  1. 4

                                                    The previous two comments are fair, but at the end of the day it doesn’t really change the fact that LAME will run a lot slower on your DragonflyBSD installation than it does on your Linux installation.

                                                    I don’t think these benchmarks are useless, but they are limited: they show what you can roughly expect in the standard stock installation, which is what the overwhelming majority of people – including technical people – use. This is not a “full” benchmark, but it’s not a useless benchmark either, not for users of these systems anyway. Maybe there is a way to squeeze more performance out of LAME and such, but who is going to look at that unless they’re running some specialised service? I wouldn’t.

                                                2. 1

                                                  This comparison, newer and from the same website, makes it look like it’s the system that’s ahead (see the geometric mean on the last page).

                                                  Not that I’m a fan of that site’s benchmarks.

                                                  1. 2

                                                    I haven’t done the math, but it seems like most of DragonFlyBSD’s results come from the 3 “Stress-NG” benchmarks, which, incidentally, measure “Bogo Ops/s”.

                                                    Here’s the benchmark page: https://openbenchmarking.org/test/pts/stress-ng

                                                    I don’t know why Phoronix uses a version called 0.11.07 when the latest on the page seems to be 1.4.0, but maybe that’s just a display issue.

                                                    1. 1

                                                      Christ @ benchmarking with Bogo anything.

                                              1. 10

                                                I’m not a huge fan of IDEs but this article is mostly nonsense.

                                                This in particular I take exception to:

                                                Less Code, Better Readability

                                                Not having Autocomplete / Code Generation guides you naturally to writing more compact and idiomatic Code in the Language of your Choice. It helps you to learn language-specific features and syntactic sugar. Consider

                                                System.out.println("hello");

                                                printf("hello");

                                                You see that in C, where using an IDE is uncommon, the language/libraries/frameworks naturally become more easily readable and writable.

                                                I see this as a win for IDEs especially in large projects because the cost of avoiding useless abbreviations is minimal.

                                                ShoeApi.GetPriceAndFeatures();
                                                

                                                is way more readable and explicit than

                                                Api.GetPrcFeat();
                                                

                                                Maybe not the most realistic example but you get what I’m saying and we’ve all seen code like the latter and had no clue what it does without drilling into the method.

                                                Also, since when does having autocomplete equal a full-blown IDE?

                                                1. 4

                                                  The reason the C library and Unix functions have such short names is, in part, because the original C linker only looked at the first 8 characters of symbol names.

                                                  (Likewise, I seem to recall that the original Unix filesystem only allowed 8-character filenames, forcing shell commands to be terse. I may be wrong on this, but most filesystems of that era had really limited names, some only 6 characters, which is why the classic Adventure game was also called ADVENT.)

                                                  1. 3

                                                    I think you might be conflating Unix with MS-DOS. The early Unix file systems allowed 14-character names, while MS-DOS limited filenames to 8 (well, 8 characters for the name, plus 3 for the extension).

                                                    1. 8

                                                      I guess Ken could have spelt creat with an e after all.

                                                    2. 3

                                                      The reason the C library and Unix functions have such short names is, in part, because the original C linker only looked at the first 8 characters of symbol names.

                                                      This was actually part of the first C standard, and it was limited to “6 significant initial characters in an external identifier”. (Look for “2.2.4.1 Translation limits” here.)

                                                      This was almost certainly due to limitations from FORTRAN and the PDP-11. PDP-11 assembly language only considered the first 6 characters. (See section 3.2.2, point 3.) FORTRAN (in)famously only used 6 characters to distinguish names. If you wanted interoperability with those systems, which were still dominant in the 80s, then writing everything to fit in 6 characters made sense.

                                                    3. 3
                                                      ShoeApi.GetPriceAndFeatures();
                                                      

                                                      is way more readable and explicit than

                                                      Api.GetPrcFeat();
                                                      

                                                      I see a few independent criteria in these comparisons:

                                                      1. The inclusion of the namespace (System. / ShoeApi. / Api vs… not)
                                                      2. Low- vs. high-context names (ShoeApi vs. Api)
                                                      3. Abbreviated vs. unabbreviated names (GetPrcFeat vs. GetPriceAndFeatures)

                                                      I prefer to elide the namespace if it appears more than a couple times. (X; Y; Z; over HighContextName.X; HighContextName.Y; HighContextName.Z;)

                                                      I prefer high context names, especially when there’s already a namespace. (service.Shoes.Api over service.Shoes.ShoeApi)

                                                      I prefer to not abbreviate. (GetPriceAndFeatures over GetPrcFeat … though both aren’t great.)

                                                      Best (if I must use a Java-like): Api.Fields(Price, Features).Get()

                                                    1. 17

                                                      This is just “Java is a miserable nightmare to program in”; it then descends into parroting the party line on Unix philosophy with nonsensical statements to back it up. “printf” vs. “System.out.println” is not a great reason.

                                                      1. 25

                                                        Yup. I’ve worked in large Java code bases, large Python code bases, and large C/C++ code bases. The reasons given in this article are imagined nonsense.

                                                        • Remember syntax, subroutines, library features: absolutely no evidence to support this. In all my years I have yet to see even a correlation between the use of IDEs and the inability to remember something about the language. Even if the IDE is helping you out, you should still be reading the code you write. (And if you don’t, then an IDE is not the problem.) This claim is a poor attempt at an insult.
                                                        • Get to know your way around a Project: If anything, IDEs make this simpler and better. I worked in GCC/binutils without tags or a language server for a while and let me just say that without them, finding the declaration or definition with grep is much less efficient.
                                                        • Avoid long IDE startup time / Achieve better system performance: Most people I know who use IDEs shut them down or switch projects about once a week, if that. This is just whining.
                                                        • Less Code, Better Readability: Tell this to the GCC and Binutils developers, who almost certainly didn’t use an IDE yet still managed to produce reams of nearly unreadable code. Yet another nonsense claim from the “Unix machismo” way of thinking.

                                                        The other points made in the article are just complaints about Java and have nothing to do with an IDE.

                                                        1. 6

                                                          Avoid long IDE startup time / Achieve better system performance: Most people I know who use IDEs shut them down or switch projects about once a week, if that. This is just whining.

                                                          It’s also not even true. In particular, tmux and most terminals top out at a few MB/s of throughput and stop responding to input when maxed, so if you accidentally cat a huge file you might as well take a break. Vim seems to be O(n^5) in line length and drops to seconds per frame if you open a few MB of minified json, and neovim (i.e. vim but a DIY IDE) is noticeably slower at basically everything even before you start adding plugins. Never mind that the thing actually slowing my PC down is the 5 web browsers we have to run now anyway.

                                                          1. 2

                                                            Vim seems to be O(n^5) in line length and drops to seconds per frame if you open a few MB of minified json

                                                            Obscenely long lines are not a very realistic use pattern. Minified JSON and JS is a rare exception. Vim uses a paging system that deals very well with large files as long as they have reasonable line sizes (this includes compressed/encrypted binary data, which will usually have a newline every 100-200 bytes). I just opened a 3GB binary archive in vim; it performed well, was responsive, and used only about 10MB of memory.

                                                            1. 3

                                                              A modern $100 SSD can read a few MB in 1 millisecond, $100 of RAM can hold that file in memory 5 million times, and a $150 CPU can memcpy that file in ~100 nanoseconds.

                                                              If they did the absolute dumbest possible implementation, on a bad computer, it would be 4-5 orders of magnitude faster than it is now.

                                                            2. 2

                                                              Oh yes, I didn’t even mention this seemingly ignored fact. I can’t speak for Vim and friends, but Emacs chokes horribly on large files (much less so with M-x find-file-literally) and if there are really long lines, which are not as uncommon as you might think, then good luck to you.

                                                        1. 28

                                                          To prevent the problem from getting any worse, we have to stop normalizing CSV as an output or input format for data transfer.

                                                          I’m not sure the author is aware of how many legacy tools not only rely on CSV, but only know how to export CSV, and how many of the people who work with it are not technical.

                                                          When I worked at a web startup, one of our clients shipped us customer information as 2GB+ of CSV files every day as an FTP upload. They never compressed it or used SSH because it was too difficult for them. I did manage to at least get them to use the pilcrow character as a field separator so that the mistyped email addresses with commas in them didn’t screw up the import (which was a custom script with all kinds of fiddling to fix various entries). This file was exported from some Oracle database (I think–it was never clear) and the people uploading were a sales team with basically no technical skills.

                                                          Fighting against this is a complete and total lost cause. Much like anything that is entrenched with lots of rough edges, CSV will take forever to go away.

                                                          1. 6

                                                            I’m not sure the author is aware of how many legacy tools not only rely on CSV, but only know how to export CSV, and how many of the people who work with it are not technical.

                                                            I’m absolutely certain the author knows that. They specifically lament all of these facts, and cite their years of experience dealing with all of these problems.

                                                            This file was exported from some Oracle database (I think–it was never clear) and the people uploading were a sales team with basically no technical skills.

                                                            And if some other format were to become universal, as easy to export and import as CSV, that wouldn’t be a problem.

                                                            Much like anything that is entrenched with lots of rough edges, CSV will take forever to go away.

                                                            The author states this themselves. And they go on to hypothesize that if Microsoft and Salesforce both supported some other format, the rest of the world would come to adopt it.

                                                            You’re making all the same points that the author does, but without speculating on how these problems could be solved. The author never claims a migration would be quick, easy, or universal. Only that tackling the core use cases of Excel and database dumps would make future work far less error prone.

                                                            1. 4

                                                              Only that tackling the core use cases of Excel and database dumps would make future work far less error prone.

                                                              I don’t believe this, and I think that’s my problem with the article. You can design a wonderous format and people will always find a way to make it difficult to work with because they often don’t want (or think they don’t want) the restrictions that come with something that makes future work less error prone.

                                                          1. 23

                                                            Maybe I just don’t get where the fossil author is coming from, but when I was using fossil regularly (and sometimes still do) I’d end up having two copies of the same repo… a short-lived one in whatever work-in-progress state with all the useless commits (“really fix edge case this time”, “friday WIP”, etc.) and then a second persistent one where I’d rsync the worktree and make a new commit or, less easily, cherry-pick from the first. I’m not convinced that seeing the sausage being made in the commit history has any benefit over a single commit or small number of clean commits per feature.

                                                            1. 14

                                                              I’m not convinced that seeing the sausage being made in the commit history has any benefit over a single commit or small number of clean commits per feature.

                                                              Yes, the lack of ability to clean up history, even locally, is the one thing about Fossil that ensures I’ll never use it outside of work, where I have to use it in some situations.

                                                            1. 2

                                                              The importance of being in the right place at the right time cannot possibly be overstated.

                                                              This is probably one of the most important things to take away from the whole thing. x86 and DOS were not grand architectural achievements so much as they were beneficiaries of circumstance. The same is probably true for a lot of tech stuff we seemingly hold in high regard.

                                                              1. 2

                                                                But somehow to me it seems like the PC itself was a stroke of genius: they basically had already sunk a lot of R&D into making small computers and had to minimize cost and time to market in order to be able to compete with the likes of Apple and Commodore. So the idea of using cheap, off-the-shelf components was a great move for making this product a reality.

                                                                I’m kind of afraid of what this says for the expected quality of stuff that is successful in the market, though. OTOH, it does mean that if you identify situations like this, where there are a lot of useful (but maybe crappy) base ingredients for a complex product, you can fit them together in a short amount of time and beat all competitors who are trying to do everything from scratch using in-house proprietary solutions. Thinks… isn’t that exactly what Amazon is currently doing with open source software in their SaaS platforms?

                                                              1. 16

                                                                I like the point about type keys, but one thing that distracted me was the refactoring in step 3. Honestly, the code in step 2 was more readable. The different approaches in step 3 are an example of the DRY principle taken too far.

                                                                1. 3

                                                                  I came to make the same comment.

                                                                  As a variant of Sandi Metz’s

                                                                  Duplication is far cheaper than the wrong abstraction

                                                                  I’ve recently come to:

                                                                  “A little duplication is better than one more abstraction.”

                                                                  The abstraction need not be wrong, or even artificial. It has a cost just by virtue of being an abstraction.

                                                                  1. 3

                                                                    createUserWithNotifications? I found that oddly specific as well.

                                                                    Admins may have different arguments for notifications, so maybe you’ll end up encoding a kind of type flag in the call, or encoding a bunch of stuff in the names of four-line functions.

                                                                    The base premise is valuable, but “overfitting the cleanliness” runs the risk of painting yourself into a corner and having to clean your way out again.

                                                                  1. 3

                                                                    I think my favourite thing about these older writings is how they have the same complaints that show up now.

                                                                    The art of thinking (before you code) often seems a lost art. Such safeguards as validating the input before you read it and keeping the user interface constant from one command to the next, are good things whose time has not (we hope) truly passed.

                                                                    This is dated 1989, now 32 years old. I have read similar sentiments in writings from the 1960s and 1970s. I posit that thinking before you code was never an art that was in regular practice, that looking at the past with rose-coloured glasses has been in fashion the whole time, and that much of what we think programming used to be like (namely, how great it was) is largely made up.

                                                                    1. 2

                                                                      At one point the authors talk about moving the system-specific stuff into their own files, hidden behind an interface. They say,

                                                                      However, in our experience, it’s much more usual for the different variants to be completely different code, compiled from different source files—in essence, the parts outside the #ifdef disappear.

                                                                      This is true, but what they don’t mention is that the “#ifdef” part now becomes part of the build system. The logic of the variants doesn’t go away; it just moves elsewhere.
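
                                                                      A tiny, hypothetical example of that move (the file names and make rule are made up for illustration):

                                                                          /* Before: one file, variants selected by the preprocessor. */
                                                                          long default_stack_size(void) {
                                                                          #ifdef __linux__
                                                                              return 8L << 20;  /* hypothetical Linux default */
                                                                          #else
                                                                              return 1L << 20;  /* hypothetical generic fallback */
                                                                          #endif
                                                                          }

                                                                          /* After: stack_linux.c and stack_generic.c each define
                                                                             default_stack_size(), and the build system picks one:
                                                                                 SRCS += stack_$(PLATFORM).c
                                                                             The #ifdef is gone from the C source, but the same
                                                                             variant-selection logic now lives in the build scripts. */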

                                                                      1. 4

                                                                        This gets very difficult with C once you realise that you don’t have completely different systems, you have overlapping sets of functionality. You end up needing a generic POSIX implementation and then a bunch of specialised ones.

                                                                        This is one of the reasons that I much prefer C++ over C. I can write a templated class that uses a set of mostly portable APIs, with the template parameterised on some per-platform tweaks and a per-platform subclass can inherit from the templated class and then export the platform-specific type as the default implementation.

                                                                        For example, in snmalloc we have a POSIX implementation of our platform-abstraction layer (PAL) that should work on pretty much any *NIX. There’s a BSD subclass that uses a generic BSD extension and another subclass of this for members of the BSD family that provide strongly aligned allocation from mmap. The FreeBSD and NetBSD implementations are identical: we only maintain them as separate things so that there’s an extension point if we want to add platform-specific behaviour. Some systems, such as Haiku, are a bit more custom, but the amount of code is fairly small. The only #ifdefs are for selecting which PAL to expose as the DefaultPAL type.

                                                                        All of the rest of the code that cares about the platform takes the PAL as a template parameter. This means that we can test the POSIX PAL on any POSIX platform and make sure that we haven’t broken it. We can also provide custom ones (e.g. for OpenEnclave).

                                                                        There’s no dynamic dispatch for any of this and the PAL functions are typically inlined everywhere that they’re used.

                                                                        Doing the same in C is a complete mess.

                                                                      1. 2

                                                                        Granted, I haven’t been involved in a massive project of the type this post is describing, but where this kind of disciplined process falls apart in my experience is that step 0 never finishes even after implementation has already started. New requirements keep coming in and old ones keep changing, and not all the changes can be deferred until after the initial version.

                                                                        Maybe it’s different when the project is sufficiently huge?

                                                                        1. 3

                                                                          New requirements keep coming in and old ones keep changing, and not all the changes can be deferred until after the initial version.

                                                                          This is why you have to learn to push back. The customer (or whoever it is) must understand that it does take time to figure all this stuff out, especially when the scope is large. Changing things on the fly is going to cost them. If step 0 is changing all the time, then it is very likely you are not in a large project. If you are in a large project and it is in constant flux, then you can be pretty sure your project is doomed.

                                                                          1. 2

                                                                            Definitely true that my projects haven’t been on the scale of what the post is talking about with eight-figure development budgets and so forth, so that may be the main difference.

                                                                            It doesn’t seem to be a matter of which industry you’re in. I’ve seen this happen even at a bank: Last year I had to throw out several weeks of work because all of a sudden the compliance department realized that implementing the product spec (which they had already reviewed multiple times and signed off on) would risk running afoul of money-laundering laws. And the compliance department at a bank, for good reason, has the authority to brush aside pretty much any amount of push-back from the engineering team short of, “It’s physically impossible to do what you want.”

                                                                            1. 1

                                                                              This is certainly a risk. It could happen in my work if the hardware teams decide to change something since we’re utterly dependent on them (and they are also our customer!).

                                                                              It’s important to realize that all you planned and worked on can be tossed with a spec change. The key is who is deciding on that spec. In the case you cited it wasn’t the development team. In retrospect, all the work put in was essentially pointless. The key is that (apparently) no one knew at the time the work was being done. To me, this is not an argument against planning and estimating. (And to be clear, I don’t think you are arguing against this.)

                                                                              The customer needs to be aware that large spec changes will cause significant delays. What gives planning and estimation a bad reputation is tying this all to hard delivery dates while continuing to absorb spec changes. Or to keep increasing the scope of the project. Agile, in the broad sense, addresses this quite well. But I don’t see it as advocating for the banishment of estimation. Or planning. Or even forethought. I’ve always seen it as a way to get something that kinda, sorta works so that you can learn from it, even if it’s not customer facing. Doing that does not imply that mapping stuff out ahead of time is a waste.

                                                                              And hey, even thrown away work teaches you something.