1. 25

    I’m glad someone is taking a look at linkers, since compilers get most of the attention. Linking is a huge bottleneck, rarely parallelized, and massively complicated.

    I sometimes wonder if linking itself is even worthwhile to have anymore, and whether systems could switch to something like either merging compiling and linking into a single step, or having directly runnable objects that can be trivially fixed up.

    1. 33

      LLD is getting a lot of attention. It replaced the multi-pass algorithm that traditional UNIX linkers use with a single-pass one that is O(n) in terms of the number of symbols, rather than O(n*m) in terms of the number of symbols and the number of object files. In exchange for this, it uses more memory, but the memory for the sum of all of the symbol tables in all object files in a large project is still pretty small compared to a modern machine’s RAM. It’s also aggressively parallelised. As the linked repo says, LLD takes 12 seconds to link Chromium, while gold (which was a rewrite of BFD ld focused on speed) takes 50 seconds.
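The single-pass idea is simple enough to sketch. This is a toy model, nothing like LLD’s actual code, and the object-file representation is invented for illustration: build one table of definitions across all inputs, then check undefined references against it, so the cost is proportional to the total number of symbols rather than symbols times object files.

```python
# Toy sketch of single-pass symbol resolution (invented representation,
# not LLD's real data structures).

def link_single_pass(objects):
    """objects: list of (defined_symbols, undefined_symbols) set pairs."""
    defined = {}
    for index, (defs, _) in enumerate(objects):
        for sym in defs:
            defined.setdefault(sym, index)  # first definition wins
    # One more linear pass to find anything still unresolved.
    unresolved = set()
    for _, undefs in objects:
        unresolved.update(sym for sym in undefs if sym not in defined)
    return defined, unresolved
```

Each symbol is touched a constant number of times, which is where the O(n) comes from; the dict of all definitions is the extra memory being paid for it.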

      It seems that the main speed win here is actually an accounting trick: it starts the linking process before compilation starts and so only counts the time between the last compile step finishing and the linker finishing. Because most of what the linker is doing does not depend on every object file, you can get most of the way through the link before you need all of the object files. This probably doesn’t get as much speedup with things like --gc-sections, which require having a complete symbol table so that you can do reachability analysis before you can do the final layout, but even then you can start copying things that are definitely reachable early, you just might end up with something in the final .o that requires you to pull in a load of things from other objects.

      Note that the author of this repo is one of the lead developers on LLD. Rui is awesome; after a five-minute conversation with him I completely re-evaluated how I thought about linkers. I hope that this is an experiment that he’s doing to prototype things that he’ll eventually add to LLD.

      As to your other suggestions:

      merging compiling and linking into a single step

      That’s what languages that do whole-program analysis (Go, Rust) do, and is also almost what LTO does. There’s a trade-off here in how much you can parallelise. If you completely combine final code-generation and linking then the compiler needs to know the locations of everything else in the object file. That’s not feasible because of circular dependencies and so you end up needing something like relocations and a final fixup step, which is doing part of what a linker does.

      The Go (Plan9) toolchain actually does this slightly differently to other things. It combines the assembler and linker so the compiler emits pseudo instructions for indirect references and the assembler / linker expands them into the shortest instruction sequence that can express the displacement that’s actually needed. This is a big win on a lot of architectures: if you can materialise a 16-bit constant in one instruction and a 32-bit one in two, your linker / assembler can do some constraint solving, try to place functions that call each other close together, and insert the single instruction in the common case and the two-instruction sequence for the places where it’s needed. Without this merging, you end up doing one of two things:

      • Use the short form and require the linker to insert thunks: jump with a short displacement and if the target is too far away, insert a trampoline somewhere close that has a three-instruction sequence, burning some branch predictor state and adding overhead when you need to do this.
      • Always emit the long sequence and often have a no-op instruction that writes 0 into a register that’s already 0.
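The relaxation decision described above can be sketched in a few lines. This is hypothetical: the instruction size and the signed 16-bit immediate range are made up for illustration, not taken from any real ISA.

```python
# Pick the short one-instruction branch when the displacement fits in a
# signed 16-bit immediate, otherwise fall back to the two-instruction
# sequence. Sizes and ranges are illustrative only.

def branch_bytes(from_addr, to_addr, insn_size=4):
    disp = to_addr - from_addr
    if -(1 << 15) <= disp < (1 << 15):  # fits in 16 bits
        return insn_size                # single instruction
    return 2 * insn_size                # long sequence
```

A linker that can also place functions that call each other close together gets to take the short form most of the time, which is the win being described.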

      C++ places a lot more demands on the linker. The modern C++ compilation model[1] generates every used template instantiation in every compilation unit that uses it, puts them in COMDAT sections, and relies on the linker throwing them away. This means that the generated instantiations are available for analysis in every module, but it also means that the compiler does a lot of redundant work. Modules and ThinLTO are intended to address this: most template instantiations will have a single home but can be pulled into other compilation units for inlining if it would help.

      having directly runnable objects that can be trivially fixed up.

      We call this ‘dynamic linking’. It is slower, because there’s still a linker; it just runs every time you run the program. Back when I used BFD ld, I always built LLVM as shared libraries for my iterative compile cycle, because a debug build of clang took 5 minutes to link. I’d always do a statically linked build before I ran the test suite, though, because the linking time (even with BFD ld) was less than the difference in time running the test suite (which invokes clang, opt, llc, and so on hundreds of times).

      There’s always a tradeoff here. Code that is faster to link is slower to run. You can make code fast to link by adding a small indirection layer: put everything in a single big lookup table, don’t do any inline fixups, and make any access to a global symbol go via that table. You can make it fast to run by aggressively generating the shortest instruction sequence that materialises the exact address. Dynamic linking generally doesn’t want to modify executable code for two reasons: it’s bad for security to have W&X (writable and executable) code and it prevents sharing[2]. This means that you end up with indirection layers. Look up ‘copy relocations’ some time for the extreme case of weirdness here.

      [1] The original Cfront C++ compiler parsed linker error messages for missing symbols and generated the template instantiations once, on demand, in an iterative cycle.

      [2] 32-bit Windows DLLs actually did something different here: DLLs were statically relocated and expected to run at a fixed address. There was a fun constraint solver that tried to find a place in the address space for all DLLs that worked with all installed EXEs. If it failed, the DLL would be statically relocated on load and would not be shared.
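The placement problem in [2] amounts to interval packing. A toy version, with invented addresses and nothing like how Windows actually represented it:

```python
# Each DLL has a preferred base address and a size; the preferred layout
# is shareable only if no two mappings overlap.

def can_share(dlls):
    """dlls: list of (base, size) pairs."""
    spans = sorted(dlls)
    return all(base + size <= next_base
               for (base, size), (next_base, _) in zip(spans, spans[1:]))
```

When `can_share` fails for some EXE’s set of DLLs, that corresponds to the fallback in the comment: relocate the offending DLL privately at load time and give up on sharing its pages.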

      1. 6

        but the memory for the sum of all of the symbol tables in all object files in a large project is still pretty small compared to a modern machine’s RAM.

        … no? I frequently experience OOMs when linking big C++ projects. With medium-scale programs, linking has to run with just one process at a time to avoid being OOM-killed (with my 16GB of RAM). With truly huge projects, such as Chromium (especially with debug symbols), I have to add immense amounts of swap space and just let the linker process thrash for a long time.

        In my experience, linker memory usage is the main limiting factor with big C++ projects, and it’s one of the main reasons I’m moving to at least 32GB of RAM the next time I upgrade anything.

        1. 1

          What linker are you using? I haven’t had that since the bad old days of BFD LD. A debug build of LLVM generates about 40GiB of object files, but symbol tables are a tiny fraction of that. The vast majority of it is… object code. That doesn’t have to be resident for lld to work. Last time I checked, lld would mmap all of the object files, but that doesn’t consume any RAM unless you have spare RAM: if RAM is constrained then the OS can just evict the object files from memory.

      2. 8

        having directly runnable objects that can be trivially fixed up

        How trivial can it get?

        The fundamental problem that linkers solve is that if you do incremental compilation, disparate modules need to be able to call into each other. ‘Linking’ just means resolving symbols into addresses; you need to do that anyway at some point. So I don’t disagree that it can get faster, but the essential complexity is still there.


        The zig compiler has been making some cool strides there, incidentally. It sidesteps the issue entirely by doing binary patching. Common Lisp does something similar, with its ability to redefine symbols in a running image.

        1. 4

          …that systems could switch to something like either merging compiling and linking into a single step…

          That would likely require all the source and objects to be known at compile time, which probably means getting rid of dynamic libraries. A lot of the complications of linking come from dynamic libs.

          1. 4

            The abstraction layer is already broken by LTO. Now the linker is an all-in-one compiler, too.

            1. 13

              In GCC, LTO is a “sort of” compiler. What happens is that GCC calls collect2 (its linker wrapper) that then calls ld with the plugin options to have it run lto-wrapper. lto-wrapper has all the objects taking part in the link. It then calls the compiler driver with special options to make it do whole program analysis. This phase then partitions the objects and runs GCC again on all those partitions, possibly using Make. Eventually all the objects that get created from this go back to the linker, which then links them.

              It’s the Unix philosophy at work. And it’s murder to debug.

              1. 5

                The Unix philosophy is about composable, reusable tools. Without reuse in different contexts, it’s just an inconvenient split of functionality.

                How would you compose these tools differently to accomplish a different goal with them?

              2. 5

                This isn’t targeted at release builds, though; it’s targeted at development builds.

                I don’t think anyone is using lto for development builds, even if they use regular optimizations.

            1. 9

              Linkers are long overdue for an upgrade. It’s not just speed that leaves a lot to be desired; the whole compiler-linker interaction is messy. Library include paths are disconnected from source includes and can get out of sync. Flags specified in the wrong order cause weird failures. The linker generally has no idea what the compiler was trying to do.

              When I have an error in the source code, the compiler will bend over backwards to help, suggest a solution, and point at it with squiggly underlines. Meanwhile the linker will just print “missing _Zzxc329z9xc in asdf.o, bye”.

              1. 8

                It’s perhaps underappreciated how much the compiler driver (not the compiler proper) knows about calling the linker and how much it does for you. If you run gcc -dumpspecs you’ll likely see the *link_command directive at the end – and it’s probably an impenetrable mess. The compiler puts all the startup, system libraries, and finalization objects in the correct order (notwithstanding user libs) and also deals with the rather obscure options that are required depending on how you compiled the code.

                If you had to link things yourself, you’d quickly find it’s very difficult. It requires a fair bit of knowledge of the C library you’re using and the system itself. Compiler drivers handling this for you is very helpful, with the caveat that there ends up being a lot of abstraction making it difficult to decipher what is going on (cough collect2 cough).

                1. 7

                  Yea, that part is annoying. When writing either makefiles or build systems, it would’ve been nice if you had one kind of rule to compile C files, one kind of rule to compile C++ files, one kind of rule to compile fortran files, and one kind of rule to link object files into an executable. But due to the immense mess of linker options, you can’t just do that. You want to link with the C compiler if your object files are only from C files, link with the C++ compiler if your object files are from C++ files or a mix of C and C++, link with the fortran compiler if your object files are from fortran, and I don’t even know how you would do it if your project has C, C++ and fortran files.

                  It’s not nice.

                2. 8

                  Library include paths are disconnected from source includes and can get out of sync

                  MSVC and clang have pragmas that you can put in your header to control what is linked.

                  Meanwhile linker will just print “missing _Zzxc329z9xc in asdf.o, bye”.

                  Every linker I’ve used in the last decade will print an error with a demangled C++ name and tell me which function referenced the symbol that was not found:

                  $ cat link.cc
                  int x(int);
                  
                  int main()
                  {
                          x(5);
                  }
                  $ clang++ link.cc
                  /usr/bin/ld: /tmp/link-ab4025.o: in function `main':
                  link.cc:(.text+0xa): undefined reference to `x(int)'
                  clang: error: linker command failed with exit code 1 (use -v to see invocation)
                  
                1. 3

                  git doesn’t have actual command to un-stage a file(s), though. To get around this limitation…

                  Limitation, or poor UI decision? I’m guessing the latter.

                  1. 10

                    newer versions of git have git restore so I think that counts

                    1. 5

                      git reset -- filename or git reset HEAD filename do the same, tho, right? And that’s been in git for ages.

                      1. 5

                        I know, just wanted to say there is now an actual command. The article claimed there wasn’t one.

                        1. 1

                          Sometimes. If the file is already in HEAD then this works, but if it’s a newly created file I don’t think this works.

                          1. 2

                            It definitely works with newly created files.

                      2. 4

                        The naming problem. There is, and always has been, git reset, which does what OP wanted; however, the “feeling” that this one command does “a lot of different things” (reset staging and reset the working tree, depending on the flags) is what made people say it doesn’t have such a command.

                        1. 3

                          I use tig which makes unstaging single files easy and natural, among other things

                        1. 7

                          I think this is the biggest advancement in Rust so far. I’ve been concerned about laying a finger on Rust for a long time because I have a lot of concerns about Mozilla and its ethics. After the Brendan Eich ordeal it was difficult to rationalize any kind of investment in a company that behaved that way.

                          Bringing Rust into the realm of the Software Foundation, I can say I’ll be following Rust with a renewed interest. Zig beat them to it, and I’d gladly consider a new project in Zig before I chose Rust, for obvious reasons, as I’m sure we’re all pretty sick of hearing about “what Rust can (kinda) do”, but all the same, these are some big names getting behind the project. That much can’t be ignored.

                          2021 is shaping up to be a fast-moving year in PLs and PLT research as well.

                          1. 15

                            as I’m sure we’re all pretty sick of hearing about “what Rust can (kinda) do”

                            At this point I hear more people saying this than I do people actually evangelizing Rust.

                            1. 23

                              Brendan Eich did something that made it hard to believe he’d be fair and welcoming in a global project that extremely heavily depends on public goodwill and participation of volunteers.

                              (And he continues to make controversial, and frankly dangerous and stupid, public statements today. He denies that face masks work during a global pandemic, and actively discourages people from listening to public health experts, for example.)

                              His job was to be the face of the company. People freely chose not to associate with a company who chose someone controversial as their face. Enough people made this free choice that it seemed wise to pick someone else.

                              I never understood why this was so terrible. What is the alternative? Force people to use products and services they don’t want to use? Forbid people from making purchasing decisions based on how they feel about employees at a company?

                              1. 12

                                Enough people made this free choice that it seemed wise to pick someone else.

                                I never understood why this was so terrible.

                                TBH I assumed the bad behavior referred to was that they kept a shitbag like Eich around as long as they did.

                                1. 6

                                  A diverse opinion being rejected in a group inherently portrays that group as exclusive in nature. Bubbling themselves in has alienated a lot of possibilities. Look at recent cuts Mozilla has to make, look at FF’s market share in the browser realm. I see W3 reporting FF as lower than 10% these days.

                                  I don’t know about his opinions on these things, I’m not really trying to open a discussion about Eich, I’m not his follower, I am just presenting the novel idea that booting people for their personal opinions leads to failure and echo chambers and whatever else.

                                  His JOB was co-founder. He CO-founded Mozilla. That’s different than being hired as CEO “here, go be the public figure, soak up those headline bullets and shut up on socials”.

                                  Anyhow, I’m not a Mozilla history expert. I don’t think it’ll be relevant in 20 years, afaict it’s already dead.

                                  Rust however, need not die, insofar as enough resources are dedicated to its longevity factor. Rust needs major reworking to be able to integrate new compiler core team, there’s major work needed to improve syntax issues, there’s HUGE work needed to do something about the compile times. I’ve seen users describe it as “not a better C, but a better C++” and I think that’s a decent designation for it for the time being. Still, without major pivoting and better resource and labor allocation, the project is in big trouble. I see people working on Rust who tweet 50-70+ a day. How productive can they really be???

                                  It’s whatever. I really like the idea of a software foundation. It’s definitely going to be helpful to have diverse minds in diverse fields bouncing ideas around. It’s great.

                                  1. 22

                                    May I refer you to the Paradox of tolerance? Groups that want to maximize diversity must exclude those who are against diversity.

                                    Eich gave public material support to Prop 8. He could have pretended he doesn’t support it, he could have “shut up on socials”, but he chose not to.

                                    1. 16

                                      I remember when he was debating this on Twitter. His response was that people not wanting to work with him because of his support of Prop 8 (which would make same-sex marriage illegal) was “fascism”.

                                      Of course, he said this to people who were freely choosing to not associate with him based on their own opinions and his public statements…while he himself was supporting expanding government power to limit the kinds of associations consenting adults could participate in.

                                      One of those was way more “fascist” than the other.

                                    2. 14

                                      A diverse opinion

                                      This characterization is both insufficient and inaccurate.

                                      1. 12

                                        His JOB was co-founder. He CO-founded Mozilla. That’s different than being hired as CEO “here, go be the public figure, soak up those headline bullets and shut up on socials”.

                                        No one complained about him until they hired him to be the CEO. I didn’t even know his name before that and I bet a lot of other people are in the same boat. You seem really offended by something but you don’t seem to know what it is…

                                        1. 7

                                          Still, without major pivoting and better resource and labor allocation, the project is in big trouble

                                          You would know better than I would, but this is honestly the first time I’ve ever heard anything other than “Rust is the language of the future and there’s no need to learn anything else ever.” I’m being only slightly facetious.

                                          Seriously, though, from a mostly-outsider’s perspective, it seems like Rust is going nowhere but up and seems to be poised to take over the world. I suppose there’s a difference between Rust-the-language and Rust-the-project, but they’re pretty much identical to me.

                                          1. 6

                                            I see people working on Rust who tweet 50-70+ a day. How productive can they really be???

                                            This is patently ridiculous as an argument.

                                        2. 12

                                          Mozilla did not own or control Rust at any point. The Rust project started out managed by Graydon Hoare in 2006, and Mozilla began financially supporting it in 2009 (link to the history). Mozilla did own the trademarks for the Rust and Cargo names and logos, which were controlled and licensed in an open manner, and protected only to the degree necessary to avoid implied official status or endorsement by the Rust project (link to the policy). Mozilla also paid for the salaries of developers who worked on Servo, for some period one of the two largest Rust projects (the other being the Rust compiler itself), as well as the salaries of some folks who worked on the Rust compiler. However, Mozilla did not exercise or influence the direction of the Rust language, and from an early period a majority of Rust’s contributors, including members of the Core Team and other Rust Teams, did not work for Mozilla.

                                          1. 4

                                            what Rust can (kinda) do

                                            I’m curious what this bit refers to

                                            1. 1

                                                This could refer to many different parts of an immature ecosystem, like GUI programming.

                                          1. 20

                                            Most candidates cannot solve this interview problem:

                                            Input: “aaaabbbcca”

                                            Output: [(“a”, 4), (“b”, 3), (“c”, 2), (“a”, 1)]

                                             Write a function that converts the input to the output. I ask this in the screening interview and give candidates 25 minutes.

                                            How would you solve it?

                                            I still think this is the best answer to the question.

                                            def func(x):
                                                return [("a", 4), ("b", 3), ("c", 2), ("a", 1)]
                                            
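Joking aside, a straightforward general solution is itertools.groupby, which groups consecutive equal characters, so the second run of “a” stays separate:

```python
from itertools import groupby

def run_lengths(s):
    # groupby yields (char, iterator-over-the-run) for each maximal
    # run of consecutive equal characters.
    return [(ch, sum(1 for _ in run)) for ch, run in groupby(s)]

run_lengths("aaaabbbcca")  # [("a", 4), ("b", 3), ("c", 2), ("a", 1)]
```

A hand-rolled loop that compares each character with the previous one works just as well within the time limit; the trap is reaching for a plain dict keyed by character, which merges the two runs of “a”.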
                                            1. 2

                                              Without a close examination, I’d wager most compiler improvements in the past ten years are about keeping up with language standards, adding more support features, and fixing bugs. Probably the most noteworthy changes have been the various sanitizers and better warning/error messages, neither of which are very simple things to do.

                                              1. 6

                                                To generate a GUID, you can use the Online GUID Generator website

                                                 Perhaps using the New-Guid cmdlet in PowerShell would be simpler.

                                                1. 1

                                                  Oof. You saved me from reading this through to the end…

                                                1. 5

                                                  YAGNI, SOLID, DRY. In that order.

                                                  As long as they all come with the caveat stated earlier in the post, namely not being a goal in and of themselves.

                                                  As for the rest, it seems accurate to me.

                                                  1. 1

                                                    I strongly believe that learning a new codebase happens best through implementing real features.

                                                    I have found assigning bugs to be just as good. Especially if the bug has a clear endpoint. Fixing bugs practically forces you to read and understand the code, whereas new features could allow you to avoid more of it. (This is dependent on the nature of the feature, of course.) Fixing a bug usually means you have to understand something beyond the surface nature of the code. If you can find the root cause of a problem and explain it to others, you are well on your way to understanding the code base.

                                                    Sometimes you’ll find a pull request that implements something very similar to what you want to do, and you can use that as a guide.

                                                    One of the most useful things that too many devs decry as “excessive process” is that every commit be tied to a ticket or design document of some kind. It’s so incredibly helpful when you’re digging through commits from weeks or months (or years) ago as it can provide background as to why a commit was made in the first place. I have had to search the commit and Changelog history of GCC for the past couple of years and it’s maddening to find the commit history with no explanation. I have to dig through the mailing list or (god forbid) the wiki in the hopes I can find any justification for the change — and I rarely find it. (The alternative to this is to write all the explanation in the commit message, which is also good and probably more suitable for open source projects.)

                                                    1. 7

                                                      It seems to me that the CARE (Code Aspires to Resemble Explanation) idea is fine, but I’m not seeing how DRY has anything to do with what is going on here.

                                                      Here’s some smells that might indicate code could be more CAREful: … The existence of comments explaining what is happening, suggesting the code doesn’t explain itself well.

                                                      I wince when I read advice like this these days. I suspect this is because I’ve always seen it come out of examples that are like something you would find in an introductory programming textbook. A lot of the time having a “what” comment is incredibly useful, as long as it’s not an obvious repeat of the code itself. This is, of course, not very practical advice. That’s why I can’t say the author’s advice is wrong (it’s not), but I do think it’s overly simplistic.

                                                      I guess I’m just reacting to the simplicity of it, and my experience in seeing it misapplied.

                                                      1. 4

                                                        but I’m not seeing how DRY has anything to do with what is going on here.

                                                        Actually I do believe this is relevant. Blindly following DRY instead of thinking about how this helps structure code to support its narrative is more easily avoided if you follow CARE as an overarching principle.

                                                         Take the example of going to extreme lengths (crossing multiple namespaces or providing a global interface just for this purpose) to link to a library that is otherwise never used in an isolated part of your code, just because you want to re-use a simple method it contains.

                                                        This might not be such a great idea, and CARE hints at why: it would make the part much harder to explain. Instead of just saying ‘this part takes the time and deterministically outputs the current positions of all moving objects’, you’d have to add ‘AND it refers to the math library from the financial module from the in-app purchasing module because that already contained a sum method’.

                                                        1. 1

                                                          It seems to me that the CARE (Code Aspires to Resemble Explanation) idea is fine, but I’m not seeing how DRY has anything to do with what is going on here.

                                                          This is good feedback, thank you! To expand a bit about how I see the relation with DRY: I think of both DRY and CARE as examples of heuristics I can use to help improve a bit of code. DRY will drive me to trigger on duplication and stamp it out. CARE invites me to consider whether the code looks like the explanation I would give of it. I think code will come to look pretty different depending on which heuristic I apply.

                                                          A lot of the time having a “what” comment is incredibly useful, as long as it’s not an obvious repeat of the code itself.

                                                          I do believe there’s situations where “what” comments can be useful. One that’s top of my mind is when performance requirements require me to compromise a bit on code legibility. I don’t believe that performance-optimized code would turn out particularly CAREful, but with good reason.

                                                          I’m curious, were you also thinking of particular types of situations where you feel “what” comments are particularly useful?

                                                          1. 5

                                                            I’m curious, were you also thinking of particular types of situations where you feel “what” comments are particularly useful?

                                                            Here’s one of my favorite examples:

                                                            b=( "${a[@]}" ) # copy array
                                                            

                                                            That’s the shortest correct way to copy an array in bash. If you use bash a lot you’ll probably memorize it, but if you’re touching bash once every couple months then the comment will save you a lot of pain.

                                                            1. 4

                                                              This is the kind of comment (or sometimes documentation) that I often see added during code review when the reviewer isn’t familiar with the library/language/etc. and requests it. It is not usually some subtlety, and when the reader becomes more familiar they don’t tend to ask for it in the future. It rubs me the wrong way because it is usually addressing a different “audience” than the rest of the comments, generally the point of writing this program is not to teach the next person who reads this about X, and the program that ends up with the comment explaining X ends up being arbitrary, as does the particular X that happens to trip somebody up this time.

                                                              This particular instance is kind of compelling, but I wonder if that’s just because I don’t write enough bash, and I get the impression you add this comment consistently which maybe changes things.

                                                              1. 5

                                                                Most code can and should assume the reader is familiar with the language in use, but in the case of code written in a language that’s rare-within-the-codebase (a role which eg bash & make often fill), I don’t think that’s a good assumption.

                                                                1. 2

                                                                  I agree with your misgivings, but on a team using multiple languages, with developers of varying experience and familiarity with each, you’re better off anticipating likely pitfalls and steering people away from them with over-explanation, or just saving them googling time, than saying “well, an X programmer should know this, and if you don’t, it’s on you” – even if the latter is true.

                                                                  1. 1

                                                                    it’s on you

                                                                    To be clear I’m happy to explain in say the discussion/email etc., it’s just immortalizing that explanation in the source that tends to put me off.

                                                                    1. 1

                                                                      Sure, I didn’t mean suggest you were being dickish. The problem is on a large code base at a big company, with people working in multiple time zones, you won’t always be around, eventually you’ll leave or move to a new team, etc, etc. Again, I agree there is something ugly about it – a better long term solution imo to document the feature set of the language everyone is expected is to know, and invest in training – but that’s a lot of work, and if it’s not going to happen comments are a decent mitigation.

                                                                2. 2

b=( "${a[@]}" ) # copy array

                                                                  This comment would help me a lot as I’m not too familiar with Bash myself.

                                                                  In the post I describe a “what comment” as a smell that might indicate code isn’t particularly CAREful, and I do feel that applies to this example. Specifically the word “copy” is essential to an explanation of the code but doesn’t appear in the code itself. I think it might just be harder to write CAREful code in some languages than in others.

                                                                  That said, I wonder how you’d feel about these alternatives to the comment:

                                                                  • Extract the copying logic into a bash function called copy_array. If that’s possible. I know Bash has functions but am not familiar enough to know for sure we could use them in this situation.
                                                                  • Rename variable b to copy_of_a.
                                                                  1. 5

                                                                    Option 1 is basically not possible in bash (functions don’t “return” values) unless you use a nameref which will be far worse than the comment and not supported in all bash versions anyway.
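For the record, here’s a sketch of what the nameref version might look like (bash 4.3+; `copy_array` and the underscored locals are made-up names). It works, but it’s arguably more ceremony than the one-liner plus a comment:

```shell
# copy_array SRC DST: copy array SRC into DST via namerefs.
# Caveat: breaks if the caller's variables are literally named _src/_dst.
copy_array() {
    local -n _src=$1 _dst=$2
    _dst=( "${_src[@]}" )
}

a=("one" "two three")
copy_array a b
echo "${#b[@]}"  # 2
echo "${b[1]}"   # two three
```

Note that this trades a comment for a bash-version requirement and a nameref-collision footgun, which supports the point above.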

copy_of_a might be a good alternative, but not always, since sometimes it won’t semantically describe the new variable clearly (i.e., it describes how you produced the new variable but not the meaning of that copy in the narrative of your code, thus violating CARE).

                                                                3. 3

                                                                  I’m curious, were you also thinking of particular types of situations where you feel “what” comments are particularly useful?

                                                                  Any time a name is ambiguous or reading the code would take much longer than reading the comment. Of course, both of these things can (and usually should) be considered code smells, but any time you aren’t able to find a perfect name, or make the code completely crisp, a short comment that will save a future developer reading time is usually a better alternative than blindly following “don’t comment what.” The master rule is: If I were coming to this for the first time, would the comment make my experience better or worse?

                                                                  1. 2

                                                                    Don’t forget about the costs of a comment beyond the time spent reading it when someone first encounters that code.

                                                                    At some point the commented code might change. If the developer then remembers to update the comment too, it will cost them only a little time. But if they don’t update it, the comment will cost future readers a lot of time as they recover from being misled by a comment that doesn’t match the behavior of the code.

                                                                    1. 1

                                                                      Agreed. The cost/benefit is a judgment call and prediction based on the likely needs and probable mistakes of future readers. Also a good argument for clarity and brevity when you do make comments.

                                                                  2. 2

                                                                    I’m curious, were you also thinking of particular types of situations where you feel “what” comments are particularly useful?

In codebases like GCC. Here’s an example in the integrated register allocator. It’s code I had to work through recently. Comments like the one on line 3811 are incredibly useful when you’re reading through. It’s code that you have to read to understand the workings of other parts of the code (parts that are probably not directly related), but it’s something you will rarely have to touch. Having comments that guide you through the steps is a godsend.

The obvious retort here is to refactor so that it’s clearer, but that would be very unwise. For one thing, the data structures would have to be redone, because they are multi-purpose and meant to be generic (otherwise you’d have so many aliases or copies that it would be more confusing). Another is the sheer legacy to be overcome. And then there is the testing. This kind of code is really tricky to get right, and messing with it can be an awful lot of work.

                                                                1. 19

                                                                  Maybe not the answer you’re looking for, but I generally don’t go actively looking for new tools.

                                                                  If I’m at a job I’ll learn what they use, and sometimes that’s new tech. If I’m on a project and something falls across my desk while researching I might follow up on it. Perhaps most frequently, if people I’m chatting with bring up “oh have you used so-and-so” I might investigate it if it looks promising.

                                                                  If I’m really, really stuck on something and I can clearly articulate what my problem is and it isn’t worth it to solve it myself, then I might go looking for a new tool–in which case, polling coworkers, Googling, and so forth is a good idea.

                                                                  In general, though, I dislike actively looking for tools because:

                                                                  • There are a lot of people selling shovels right now, and every one of them is only a marginal improvement over what I already use.
                                                                  • There are massive amounts of posturing and social interaction over tooling and displays about tooling–consider people preening and arguing over vim and emacs, or rust and c++, or whatever…lots of noise, little signal.
                                                                  • It’s easy to succumb to the fear of missing out and make bad technical decisions just so you have the latest tools.
                                                                  • The work I tend to do (or at least like to think that I tend to do) either tends to be so boring that any old tool will do (and so I’ll just use what I’m quickest with) or so weird that like no tool is really gonna solve my problem.
                                                                  • I find that I get more mileage learning about and testing the boundaries of my boring old tools than perpetually hopping from one thing to the next. I can solve most problems with Postgres, for example, and even the other day I learned a neat thing about how query planning works (or doesn’t) with its jsonb type–whereas if I’d just hopped to a new tool I probably would’ve overlooked it.

                                                                  Maybe the biggest thing for me is that, at the end of the day, I get enjoyment from actually solving problems and building things, and it is hard to do that if you keep finding strangers in your toolbox.

                                                                  1. 16

                                                                    If I’m at a job I’ll learn what they use, and sometimes that’s new tech.

                                                                    There’s no greater friction than going against the grain of what your organization is built around.

                                                                    1. 7

                                                                      I find that I get more mileage learning about and testing the boundaries of my boring old tools than perpetually hopping from one thing to the next.

                                                                      This really deserves to be repeated more. People are always harping about “using the best tool for the job”, and while that may be true, the best tool might only look like the best tool from a distance. When you get to know the tool better you’ll almost certainly find inadequacies that are (deliberately or not) being hidden. Alternatively, it might be the best tool but the learning curve could be so high that you will only be able to use it effectively after falling on your face a few times, and your current project that needs it might not be the right place or time to learn. With the tools you already know, you will know how to avoid the sharp edges and make the most effective use of them.

                                                                      Of course, it’s important that you don’t stagnate, as technology does progress, and you don’t want to be that one dude that keeps clinging to his ancient practices (like for example in the early 2000s when people would be using shared FTP and manual backups instead of VCSes). Although I do find that more often than not, it seems like we as an industry are taking two steps forward, one step backward every single time.

                                                                      1. 1

                                                                        I’m in this camp. I often find that my hunt for a new tool comes out of laziness. I know full well that I have what I need in front of me, but I can’t summon the ambition to make it work. That’s usually when I’ll take a step back and restate my assumptions about what it is I’m working on.

                                                                      2. 2

                                                                        solving problems and building things, and it is hard to do that if you keep finding strangers in your toolbox.

This is a fantastic point (and also a wonderful turn of phrase). Sure, there may be a “better” tool to use, but oftentimes the tool you know will get you to a solution more successfully (faster, fewer bugs, etc.) than fumbling around with something unfamiliar.

                                                                        1. 1

                                                                          This is me as well. Maybe it’s because I don’t really bother with software as a hobby (my hobbies are time consuming and not computer related), but I pretty much only focus on the current tool chain in place at my work. That has meant learning new tools (web frameworks, database admin tools, new languages), but I tend to focus on mastering what is in place instead of chasing the next shiny thing (that might never be used at my place of work anyway.)

                                                                          I used to worry that this would cause my skills to atrophy, but I’ve mostly found the opposite. Like you say, a lot of what is out there is shovels. I also mostly focus on backend work, so I get more value out of my time understanding design patterns in my language than in searching for yet more frameworks.

                                                                        1. 4

What worries me is that, basically, there is no real solution to messaging right now. So anything I might choose and decide to recommend is me betting that it won’t take a bad turn. But at the same time, I can’t betray people’s trust all the time by saying X was bad, Y is better (for now). And putting it as it is, “X appears to be good enough for now” doesn’t sound confident enough to motivate friends into switching. So all that is left between alarmism and realism appears to be cynically advocating for something like Signal, not because it is the best, but because it is the most likely to disrupt the current landscape held together by the network effect. Until then, you can just hope that there will be a proper solution, i.e. something secure, with a specification and without dependence on a single organization.

                                                                          1. 9

                                                                            Until then, you can just hope that there will be a proper solution, i.e. something secure, with a specification and without dependence on a single organization.

Maybe it’s time to start wondering whether a decentralized or multi-organizational tool is actually worse. So far, every attempt at one has failed, and the outlook is not good.

                                                                            What worries me is that, basically, there is no real solution to messaging right now.

                                                                            What is a “real” solution? Something with a spec and decentralized, as the quote earlier suggests?

                                                                            …the most probable to disrupt the current landscape held together by the network effect.

                                                                            I posit that any messaging system will require the network effect. Making a good protocol, for example, is not nearly enough.

                                                                            1. 3

                                                                              Maybe it’s time to start wondering whether a decentralized or multi-organizational tool is actually worse.

                                                                              The advantage of a non-centralized network is that there is no central point of failure, neither technical nor social, which I think is important. But of course, it is more difficult to implement, which I believe is the reason why attempts at this have historically been worse. I’m cautiously optimistic about Matrix though.

                                                                              What is a “real” solution?

                                                                              To oversimplify: Something that isn’t a compromise.

                                                                              I posit that any messaging system will require the network effect.

Conversely, weakening the network effect of existing networks makes it easier for newer solutions to compete.

                                                                            2. 10

I would totally prefer to build on top of an incentive-aligned protocol enabling secure and cheap communication. Signal is not that.

But bitching about the fringe theoretical gripes of technical folk, at a moment when alphabet-soup agencies siphon off all the communication data… it’s just shortsighted. Signal is a tool ready for mass consumption. The alternatives are really not even close, including everything Matrix and XMPP.

                                                                            1. 2

                                                                              I would like to propose a new tag: Algorithms

                                                                              When suggesting new tags it is good practice to list a bunch of lobste.rs submissions that would deserve the tag. I haven’t seen many articles that are purely about algorithms so I’m not sure I support the creation of an “Algorithm” tag.

                                                                              Failing that, can someone suggest the correct tag to use if one submits an algorithm?

                                                                              When no tag applies to the story you’re submitting you can use the “programming” one.

                                                                              1. 3

                                                                                There have been several occasions where I’ve come across algorithms and come here to submit them, then not done so because they’re not about programming, they’re about algorithms, and no other existing tag really fits.

So perhaps there aren’t many existing submissions that would use the tag because the tag is absent, and people thereby don’t submit things that would use it.

                                                                                Then again, maybe an article that’s about an algorithm without being about programming doesn’t really fit the direction of this site, but that’s why I’ve asked the question.

                                                                                1. 7

                                                                                  I personally would like to see algorithm posts, so I do recommend posting them, under compsci tag. After a while we could check how they are received.

                                                                                  1. 3

                                                                                    There have been several occasions where I’ve come across algorithms and come here to submit them, then not done so because they’re not about programming, they’re about algorithms, and no other existing tag really fits.

                                                                                    I suspect that compsci or programming or math could all work.

                                                                                    1. 3

                                                                                      Could you list some of the articles you wanted to submit but didn’t?

                                                                                      1. 1

                                                                                        It’s been too long, I don’t have any in mind now, and can’t remember those that I passed over. I know that Lobsters is more focussed and usually don’t consider posting things here because my interests are, in general, not strongly aligned, so when I’ve been reading, Lobsters has usually not come to mind.

                                                                                        I know there have been some, but I can’t say what they were. After this discussion I’ll pay more attention.

                                                                                        1. 1

                                                                                          When in doubt, just post it with the best fitting tag, or programming if you’re not sure. Few will mind if you use a “wrong” tag if it’s an interesting article and the suggest mechanism can correct this.

                                                                                  1. 1

Another great (!) article from the StackOverflow blog, who have previously also talked about editors and presented Vim and Emacs as unusable. The solution to sharding is not to introduce a datastore that has neither a proper schema nor any sort of relations; it is fixing the existing databases. The lack of a schema is listed as a good thing, but it becomes a nightmare if you have an existing application. Adding new things to the application that can use non-existent keys means either adding doc.value || "" everywhere, or re-implementing what SQL does for you with default values anyway. Additionally, as the article mentions, the lack of joins means that you will have to embed related data within each row, which leads to duplication and bloated stores. Now, I completely agree with the sentiment that creating indexes is hard and it’s hard to properly optimize a database, but that doesn’t mean we should just throw our hands up in the air and embed related data in each row (duplicating it in the process).

                                                                                    1. 2

                                                                                      Another great (!) article from the StackOverflow blog, who have previously also talked about editors and presented Vim and Emacs as unusable.

                                                                                      That article was particularly terrible. Citing it as support for this one doesn’t really bolster any confidence in me that this one is any good.

                                                                                      1. 2

                                                                                        To clarify, this article doesn’t cite that one, it’s just by the same authors.

                                                                                    1. 8

                                                                                      I work for the company that makes the device targeted by this linker script (I work on the compiler toolchain) and I am very familiar with GNU ld linker scripts. I consider its language to be the worst I’ve ever used. I am not being hyperbolic. If you can go without having to touch them, then I recommend you do so.

                                                                                      This is a nice primer for understanding the basics of linker scripts. One thing that was a bit underemphasized is the tight integration with the startup and library code. You need KEEP around all those input section descriptors because the linker will remove them if they are not referenced and you use --gc-sections, which is typical when building for embedded devices. Most of those symbols are “magic”. (Also note: they all start with an underscore because they take advantage of the rule in C that says external identifiers that start with an underscore are reserved.)

Things get really hairy with GNU linker scripts when you have to start placing things at specific memory locations, which is common on such devices. This linker script is very simple: there is no oddball memory configuration with heavily customized placement of code and/or data. The GNU linker script language is really, really bad at these things.

I am surprised there is so much concern about heap space. A SAM D21 has, at most, 32KB of SRAM. Apps on such small devices generally don’t (and shouldn’t) use malloc; you open yourself up to many problems that way. It is better to restructure your app so that all memory use, aside from stack space, is known at build time.

                                                                                      1. 4

                                                                                        I am very familiar with GNU ld linker scripts. I consider its language to be the worst I’ve ever used.

                                                                                        Are there any good alternatives? There were some discussions on the lld mailing lists about adding something different but no non-GNU linkers seemed to have something that was sufficiently powerful for the things that people use linker scripts for and / or as readable as the incredibly low bar set by GNU ld. I’d love to see a better replacement.

                                                                                        1. 1

                                                                                          I haven’t used the linker from IAR, but I have seen their linker control files and they are much simpler to read. I can’t speak to their expressiveness. SEGGER has a new linker and its syntax looks a lot better; I haven’t had time to really look into it. Both of these are geared toward embedded systems, of course. As for large projects, I simply don’t know what might be missing.

                                                                                          I don’t know that there is a good alternative to GNU ld linker scripts, which is disappointing, really. I suspect it’s because writing a linker is a lot of work so you just use what’s there. If lld is looking at something else, maybe I should see what I can do to help out. I would love to be able to provide something more approachable than the mess that is GNU’s syntax.

                                                                                      1. 13

                                                                                        It’s little things like this that add up.

Indeed. It’s for this reason I’ve given up on Linux as a desktop OS and only use it via SSH these days.

                                                                                        1. 11

                                                                                          This seems like a kind of arbitrary list that skips, among other things, iOS and Android, and that compares a list of technologies invented over ~40 years to a list that’s in its twenties.

                                                                                          1. 7

                                                                                            I noticed that Go was mentioned as a post-1996 technology but Rust was not, which strikes me as rather a big oversight! Granted at least some of the innovations that Rust made available to mainstream programmers predate 1996, but not all of them, and in any case productizing and making existing theoretical innovations mainstream is valuable work in and of itself.

                                                                                            In general I agree that this is a pretty arbitrary list of computing-related technologies and there doesn’t seem to be anything special about the 1996 date. I don’t think this essay makes a good case that there is a great software stagnation to begin with (and for that matter, I happened to be reading this twitter thread earlier today, arguing that the broader great stagnation this essay alludes to is itself fake, an artifact of the same sort of refusal to consider as relevant all the ways in which technology has improved in the recent past).

                                                                                            1. 2

                                                                                              It’s also worth noting that Go is the third or fourth attempt at similar ideas by an overlapping set of authors.

                                                                                              1. 1

                                                                                                The author may have edited their post since you read it. Rust is there now in the post-1996 list.

                                                                                              2. 3

                                                                                                I find this kind of casual dismissal that constantly gets voted up on this site really disappointing.

                                                                                                1. 2

                                                                                                  It’s unclear to me how adding iOS or Android to the list would make much of a change to the author’s point.

                                                                                                  1. 3

                                                                                                    Considering “Windows” was on the list of pre-1996 tech, I think iOS/Android/touch-based interfaces in general would be a pretty fair inclusion of post-1996 tech. My point is that this seems like an arbitrary grab bag of things to include vs not include, and 1996 seems like a pretty arbitrary dividing line.

                                                                                                    1. 2

                                                                                                      I don’t think the list of specific technologies has much of anything to do with the point of how the technologies themselves illustrate bigger ideas. The article is interesting because it makes this point, although I would have much rather seen a deeper dive into the topic since it would have made the point more strongly.

                                                                                                      What I get from it, and having followed the topic for a while, is that around 1996 it became feasible to implement many of the big ideas dreamed up before due to advancements in hardware. Touch-based interfaces, for example, had been tried in the 60s but couldn’t actually be consumer devices. When you can’t actually build your ideas (except in very small instances) you start to build on the idea itself and not the implementation. This frees you from worrying about the details you can’t foresee anyway.

                                                                                                      Ideas freed from implementation and maintenance breed more ideas. So there were a lot of them from the 60s into the 80s. Once home computing really took off with the Internet and hardware got pretty fast and cheap, the burden of actually rolling out some of these ideas caught up with them. Are they cool and useful? In many cases, yes. They also come with side effects and details not really foreseen, which is expected. Keeping them going is also a lot of work.

                                                                                                      So maybe this is why it feels like more radical ideas (like, say, not equating programming environments with terminals) don’t get a lot of attention or work. But if you study the ideas implemented in the last 25 years, you see much less ambition than you do before that.

                                                                                                      1. 2

                                                                                                        I think the Twitter thread @Hail_Spacecake posted pretty much sums up my reaction to this idea.

                                                                                                    2. 2

                                                                                                      I think a lot of people are getting woosh’d by it. I get the impression he’s talking from a CS perspective. No new paradigms.

                                                                                                      1. 3

                                                                                                        Most innovation in constraint programming languages and all innovation in SMT are after 1996. By his own standards, he should be counting things like peer-to-peer and graph databases. What else? Quantum computing. Hololens. Zig. Unison.

                                                                                                        1. 2

                                                                                                          Jonathan is a really jaded guy with interesting research ideas. This post got me thinking a lot but I do wish that he would write a more thorough exploration of his point. I think he is really only getting at programming environments and concepts (it’s his focus) but listing the technologies isn’t the best way to get that across. I doubt he sees SMT solvers or quantum computing as something that is particularly innovative with respect to making programming easier and accessible. Unfortunately that is only (sort of) clear from his “human programming” remark.

                                                                                                      2. 2

                                                                                                        It would strengthen it. PDAs - with touchscreens, handwriting recognition (whatever happened to that?), etc. - were around in the 90s too.

                                                                                                        Speaking as someone who only reluctantly gave up his Palm Pilot and Treo, they were in some ways superior, too. They had a much more obsessive focus on UI latency - especially on Palm - and were far less fragile. I can’t remember ever breaking a Palm device, and I have destroyed countless glass-screened smartphones.

                                                                                                        1. 3

                                                                                                          The Palm Pilot launched in 1996, the year the author claims software “stalled.” It was also created by a startup, which the article blames as the reason for the stall: “There is no room for technology invention in startups.”

                                                                                                          They also didn’t use touch UIs, they used styluses: no gestures, no multitouch. They weren’t networked, at least not in 1996. They didn’t have cameras (and good digital cameras didn’t exist, and the ML techniques that phones use now to take good pictures hadn’t even been conceived of yet). They couldn’t play music, or videos. Everything was stored in plaintext, rather than encrypted. The “stall” argument, as if everything stopped advancing in 1996, just doesn’t really hold much water with me.

                                                                                                          1. 1

                                                                                                            The Palm is basically a simplified version of what already existed at the time, to make it more feasible to implement properly.

                                                                                                    1. 12

                                                                                                      Please, please, please don’t comment code as per the “good” code example. In my opinion, comments should explain the “why” or other non-obvious design decisions instead of merely reiterating what the code is doing.
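
                                                                                                      To make the distinction concrete, here is a small sketch of the same line commented both ways (the timeout doubling and the proxy behaviour are invented purely for illustration):

```python
# Hypothetical example contrasting a "what" comment with a "why" comment.
# The proxy behaviour described below is invented for illustration.

base_timeout_seconds = 30

# "What" comment -- merely restates the code, adds no information:
# Double the timeout.
timeout = base_timeout_seconds * 2

# "Why" comment -- records the non-obvious design decision:
# The upstream proxy retries each failed request once internally, so a
# client effectively waits through two attempts; double the timeout so
# the first attempt is not cut short.
timeout = base_timeout_seconds * 2

print(timeout)  # 60
```

                                                                                                      The code is identical in both cases; only the second comment tells a future reader something they could not get from the line itself.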

                                                                                                      1. 6

                                                                                                        Yeah, the example is uncompelling and I would not look kindly on it in a review.

                                                                                                        That said, projects like GCC have comments of the “it does this” nature, and they are immensely useful because it is usually not obvious what the code is, in fact, doing. The reasons for this are legion, but even something seemingly simple benefits from basic comments, because you often end up jumping into this code from something that is essentially unrelated. Without those kinds of comments, you would end up spending an incredible amount of time getting to know the module (which is often very complicated) just to get what tends to be tangential but important information.

                                                                                                      1. 16

                                                                                                        Not to be glib, but I don’t.

                                                                                                        If I have personal stuff going on that requires more than a page in a notebook, I probably have too much going on.

                                                                                                        1. 3

                                                                                                          Totally with you. I don’t either.

                                                                                                          It’s not about being glib or pessimistic. For me it’s about my sanity. The list of things I want to do and haven’t will always be longer than the list of things I have. In a perfect world, I would have a PhD, be the author of a wildly popular programming language or library or something, and be highly regarded in my field. There’s nothing wrong with being down to earth and realizing that at this point in my life the odds of any of those happening are slim. I’m okay with that.

                                                                                                          Now on a positive note, that doesn’t mean I’m not constantly out to do something I haven’t. Or trying to accomplish a new goal in my life. I just try not to look back.

                                                                                                          1. 1

                                                                                                            For me, having more than that page is a sign that I am not working on the right thing, or not in the right way. For work and other commitments it is different, but personal “goals” and projects are the only place where you can do things purely driven by intrinsic motivation. If you have enough of that, you don’t need a system to guide you, or a system that helps you stay disciplined. It is very important for my mental health to have a place in my life where that can happen.

                                                                                                            So, if I have a day off and I can do whatever I want, what should I work on? On whatever I feel like. It’s that simple.