Threads for qznc

  1. 4

    My tip: Instead of cp or scp, use rsync. It is more efficient for large files. It will not touch the target if it already exists with the same contents, and will only update it if the contents differ. It even uses the same parameter ordering as cp (although that ordering is wrong imho).

    1. 4

      On a higher level, there are two main questions in discussions:

      1. What exactly do we have to decide? This is the difference between “which framework is best?” and “we are about to create a new service, should we use the same framework as we usually do?” This puts the discussion into a context. Without context you can argue past each other forever because everybody assumes a different context with lots of hidden assumptions.

      2. What are the relevant aspects for the decision? You can compare web frameworks according to their parts and how mature/built-in/supported they are. These are usually not the decisive aspects, though. The programming language does matter, because your team is usually only really competent in one or two of them, so every other language would come with a big risk and learning effort. Sometimes you want to pick the long-term best option. Sometimes the next deadline is more important.

      Without answers to these two questions, precision in your discussions is in vain. If you do have the answers, then this article has good tactical advice.

      1. 2

        This doesn’t actually solve the hard problem, which is estimating the financial cost of a less-than-optimal implementation.

        1. 3

          I think the framework is there: you could calculate a probability distribution of financial cost based on the number of distinct issues and their microdefect amounts. Optimally you’d also use a probability distribution for the cost of a single defect instead of just using an average.

          For an organization well-versed in risk management this might just work. But without understanding the concept of probabilistic risk I don’t believe the tradeoffs in implementation (and design) can be managed.

          The article seems to focus on just the expected value of microdefects. This might be enough for some decisions, but it’s not a good way to conceptualize “technical debt”.
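
          A minimal sketch of what such a model could look like, with entirely made-up numbers and assumptions (one microdefect treated as a one-in-a-million chance of a defect, a Poisson defect count, and a log-normal cost per single defect):

            #include <algorithm>
            #include <cmath>
            #include <cstdio>
            #include <random>
            #include <vector>

            int main() {
                std::mt19937 rng{42};

                // Assumption: 250,000 accumulated microdefects ~ 0.25 expected defects per period.
                std::poisson_distribution<int> defect_count(250000 * 1e-6);
                // Assumption: the cost of a single defect is log-normal with a median around 5000.
                std::lognormal_distribution<double> defect_cost(std::log(5000.0), 1.5);

                const int runs = 100000;
                std::vector<double> totals(runs);
                for (int i = 0; i < runs; ++i) {
                    double total = 0.0;
                    for (int d = defect_count(rng); d > 0; --d)
                        total += defect_cost(rng);
                    totals[i] = total;
                }
                std::sort(totals.begin(), totals.end());
                std::printf("median: %.0f  95th percentile: %.0f  99th percentile: %.0f\n",
                            totals[runs / 2], totals[runs * 95 / 100], totals[runs * 99 / 100]);
            }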

          1. 3

            One interesting implication is that if we can estimate the costs of different violations, we can estimate the cost-saving of tools that prevent them.

            For example, if “if without an else” is $0.01, then a linter that prevents that or a language where conditionals are expressions rather than statements automatically saves you a dollar per 100 conditionals.

            1. 2

              you could calculate a probability distribution of financial cost based on the number of distinct issues and their microdefect amounts

              My point is, we can’t do that because we don’t know what the average cost of a defect is, and we have no way of finding out.

              1. 2

                I think we do (certainly I have some internal numbers for some of these things); the thing that we don’t know is the cost distribution of defects. For example, the cost of a security vulnerability that allows arbitrary code execution is significantly higher than the cost of a bug that causes occasional and non-reproducible crashes on 0.01% of installs. A bug that causes non-recoverable data corruption for 0.01% of users is somewhere in the middle. We also don’t have a good way of mapping the probability of any kind of bug to something in the source code at any useful granularity (we can say, for example, that the probability of a critical vulnerability in a C codebase is higher than in a modern C++ one, but that doesn’t help us target the things to fix in the C codebase, and rewriting it entirely is prohibitively expensive in the common case).

                1. 1

                  What sorts of things do you have numbers for, if you can share? I have heard of people estimating costs, but only for performance issues when you can map it to machine usage costs pretty easily, so I’d be interested in other examples.

                2. 1

                  It’s true we can’t know the distribution or the average exactly. But if you measured the cost of each found defect after it’s fixed, you could make a reasonable statistical model after N=1000 or so. And note that we do know lower and upper bounds for the financial cost of a defect: the cost must typically be between zero and the cost of bankruptcy.

                  1. 4

                    if you measured the cost of each found defect after it’s fixed, you could make a reasonable statistical model after N=1000 or so

                    You are also assuming the hard part. How are you measuring the cost of a defect?

                    1. 1

                      It depends a lot on the business you are in. For Open Source it is hopeless because you don’t know how many users you even have. My work is in automotive, where we can count the cost for customer defects quite well. Probably better than our engineering costs in general.

                      1. 1

                        we can count the cost for customer defects quite well

                        Are these software defects or hardware defects? As a followup, if they are software defects, are they the sort of defects that would be described as “tech debt” or as outright bugs?

                        1. 1

                          Yes, the classification is still tricky. Assume we have a defect. We trace it down to a simple one line change in the software and fix it. Customer happy again. They get a price reduction for the hassle. That amount plus the effort invested for debugging and fixing is the cost of the defect.

                          Now we need to consider what technical debt could have encouraged writing that bug: Maybe a variable involved violated the naming convention so the bug was missed during code review? Maybe the cyclomatic complexity of the function is too high? Maybe the Doxygen comment was incomplete? Maybe the line was not covered by a unit test? For all such possible causes, you can now adapt the microdefect cost slightly upwards.
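
                          A toy sketch of that bookkeeping (hypothetical category names; splitting the defect cost evenly across the co-occurring findings is just one possible attribution rule):

                            #include <map>
                            #include <string>
                            #include <vector>

                            // One "technical debt" finding at the fixed location, e.g. "naming-convention",
                            // "high-cyclomatic-complexity", or "missing-unit-test".
                            struct Finding { std::string category; };

                            std::map<std::string, double> cost_per_category;  // running estimate per category
                            std::map<std::string, int>    samples;            // defects seen per category

                            // Called once per traced defect with its total cost (price reduction plus effort).
                            void record_defect(double defect_cost, const std::vector<Finding>& findings) {
                                if (findings.empty()) return;
                                double share = defect_cost / findings.size();  // naive even attribution
                                for (const auto& f : findings) {
                                    int& n = samples[f.category];
                                    double& estimate = cost_per_category[f.category];
                                    estimate = (estimate * n + share) / (n + 1);  // incremental average
                                    ++n;
                                }
                            }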

                          1. 1

                            That’s an interesting idea. Microdefects would then work well, because you average out differences, like how much it costs to make a customer happy again, that don’t have much to do with the bug itself.

                            Do you have a similar process for bugs that don’t affect customers, or correct but inefficient code implementations?

                            1. 1

                              You are thinking of those “phew, glad we found that before anyone noticed” incidents, I assume. The cost is only the effort here.

                              We have something similar. Sometimes we find a defect which has already shipped, but apparently neither the customer (OEM) nor the users have noticed. Then there is a risk assessment, where tradeoffs are considered:

                              • How many users do we expect to notice it? Mostly depends on how many users there are and how often the symptoms occur.
                              • How severe is the impact? If it is a safety risk, fixing is mandatory.
                              • How much will it cost to fix it? Again, the more users there are the higher the cost.
                              • How visible is the fix? If you bring a modern car to the yearly inspection, chances are that quite a few bugfixes are installed to various controllers without you noticing it.

                              You can estimate anything but of course the accuracy and precision can get out of hand.

            1. 24

              I’m sympathetic to the goal of making reasoning about software defects more insightful to management, but I feel that ‘technical debt’ as a concept is very problematic. Software defects don’t behave in any way like debt.

              Debt has a predictable cost. Software defects can have zero costs for decades, until a single small error or design oversight creates millions in liabilities.

              Debt can be balanced against assets. ‘Good’ software (if it exists!) doesn’t cancel out ‘Bad’ software; in fact, it often amplifies the effects of bad software. Faulty retry logic on top of a great TCP/IP stack can turn into a very damaging DoS attack.

              Additive metrics like microdefects or bugs per line of code might be useful for internal QA processes, but especially when talking to people with a financial background, I’d avoid them, and words like ‘debt’, like the plague. They need to understand software used by their organization as a collection of potential liabilities.

              1. 11

                Debt has a predictable cost. Software defects can have zero costs for decades, until a single small error or design oversight creates millions in liabilities.

                I think you’ve nailed the key flaw with the “technical debt” metaphor here. It strongly supports the “microdefect” concept, which is explicitly built by analogy to microCOVID (the piece doesn’t mention that microCOVID is itself named for the micromort). The analogy works really well for your point: these issues cost very little until a sudden, potentially catastrophic failure. Maybe “microcrash” or “microoutage” would be a clearer term; I’ve seen “defect” used for pretty harmless issues like UI typos.

                The piece is a bit confusing in that it relies on the phrase “technical debt” while trying to supplant it; it’d be stronger if it used the term only once or twice, to argue its limitations.

                We’ve seen papers on large-scale analyses of bugfixes on GitHub. Feels like that route of large-scale analysis could provide some empirical justification for assessing values of different microdefects.

                1. 1

                  I’m very surprised by the microcovid.org website not mentioning their inspiration from the micromort.

                  1. 1

                    It’s quite possible they invented the term “microCOVID” independently. “micro-” is a well-known prefix in science.

                  2. 1

                    One thing I think focusing on defects fails to capture is the way “tech debt” can slow down development, even if it’s not actually resulting in more defects. If a developer wastes a few days flailing because they didn’t understand something crucial about a system, e.g. because it was undocumented, then that’s a cost even if it doesn’t result in them shipping bugs.

                    Tangentially related: the defect model also implicitly assumes that a particular behavior of the system is either a bug or not a bug. Often things are either subjective or at least a question of degree; performance problems often fall into this category, as do UX issues. But I think things which cause maintenance problems (lack of docs, code that is structured in a way that is hard to reason about, etc.) often work similarly, even if they don’t directly manifest in the runtime behavior of the system.

                    1. 1

                      Microcovids and micromorts at least work out in the aggregate; the catastrophic failure happens to the individual, i.e. there’s no joy in knowing the chance of death is one in a million if you happen to be that fatality.

                      Knowing the number of code defects might give us a handle on the likelihood of one having an impact, but not on the size of its impact.

                    2. 3

                      Actually, upon re-reading, it seems the author defines technical debt purely in terms of code beautification. In that case the additive logic probably holds up well enough. But since beautiful code isn’t a customer-visible ‘defect’, I don’t understand how monetary value could be attached to it.

                      1. 3

                        I usually see “tech debt” used to describe following the “no design” line on https://www.sandimetz.com/s/012-designStaminaGraph.gif past the crossing point. The idea is that the longer you stay on this part of the curve, the harder it becomes to introduce or implement any design, and the slower maintaining the code becomes.

                        1. 1

                          I think this is the key:

                          For example, your code might violate naming conventions. This makes the code slightly harder to read and understand which increases the risk to introduce bugs or miss them during a code review.

                          Tech debt so often leads to defects, they become interchangeable.

                          1. 1

                            To me, this sounds like a case of the streetlight effect. Violated naming conventions are a lot easier to find than actual defects, so we pretend fixing one helps with the other.

                        2. 3

                          I think it’s even simpler than that: All software is a liability. The more you have of it and the more critical it is to your business, the bigger the liability. As you say, it might be many years before a catastrophic error occurs that causes actual monetary damage, but a sensible management should have amortized that cost over all the preceding years.

                          1. 1

                            I think it was Dijkstra who said something like “If you want to count lines of code, at least put them on the right side of the balance sheet.”

                          2. 2

                            Debt has a predictable cost

                            Only within certain bounds. Interest rates fluctuate and the interest rate that you can actually get on any given loan depends on the amount of debt that you’re already carrying. That feels like quite a good analogy for technical debt:

                            • It has a certain cost now.
                            • That cost may unexpectedly jump to a significantly higher cost as a result of factors outside your control.
                            • The more of it you have, the more expensive the next bit is.
                            1. 1

                              especially when talking to people with a financial background, I’d avoid them, and words like ‘debt’, like the plague

                              Interesting because Ward Cunningham invented the term when he worked as a consultant for people with a financial background to explain why code needs to be cleaned up. He explicitly chose a term they knew.

                              1. 1

                                And he didn’t choose very wisely. Or maybe it worked at the time if it got people to listen to him.

                            1. 2

                              I’m among those people who repeatedly claim that atomic commits are the one advantage of monorepos. This article tells me it isn’t a strong reason because this ability is rarely or never used. Maybe; I’m not a big fan of monorepos anyway.

                              I agree with the author that incremental changes should be preferred for risk mitigation. However, what about changes which are not backwards-compatible? If you only change the API provider, then all users are broken. You cannot do this incrementally.

                              Of course, changes should be backwards compatible. Do Google, Facebook, and Microsoft achieve this? Always backwards compatible?

                              1. 6

                                You’d rewrite that single non-backwards compatible change as a series of backwards compatible ones, followed by a final non-backwards compatible change once nobody is depending on the original behavior any more. I’d expect it to be possible to structure pretty much any change in that manner. Do you have a specific counter-example in mind?
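
                                As a sketch with hypothetical names, the usual shape of such a series is:

                                  // Hypothetical example: we want to change this API from miles to kilometers,
                                  // which done in one step would silently break every caller.

                                  // Step 1 (backwards compatible): add the new entry point and keep the old one
                                  // working, marked deprecated so callers can migrate on their own schedule.
                                  void setRangeKm(double km);

                                  [[deprecated("use setRangeKm instead")]]
                                  inline void setRange(double miles) { setRangeKm(miles * 1.609344); }

                                  // Step 2 (backwards compatible): update the callers, one commit at a time.

                                  // Step 3 (the only breaking change): once nobody calls setRange any more,
                                  // delete it in a change that touches no callers at all.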

                                1. 5

                                  We used to have an internal rendering tool in a separate repo from the app (rendering tests were slow).

                                  The rendering tool ships with the app! There’s no version drift or anything.

                                  When it was a separate repo you’d have one PR with the changes to the renderer and another with the changes to the app; you had to cross-reference both (it’s a lot easier to review changes when you can also see the usage changes by consumers), then merge on one side, then update the version on the other side, and only then did you end up with a nice end-to-end change.

                                  It’s important to know how to make basically any change backwards compatible, but compared to the easy change, the cost of doing so is extremely high and the process error-prone IMO. Especially when you have access to all the potential consumers.

                                  1. 4

                                    That approach definitely works, but it doesn’t come for free. On top of the cost of having to roll out all the intermediate changes in sequence and keep track of when it’s safe to move on, one cost that I see people overlook pretty often is that the temporary backward compatibility code you write to make the gradual transition happen can have bugs that aren’t present in either the starting or ending versions of the code. Worse, people are often disinclined to spend tons of effort writing thorough automated tests for code that’s designed to be thrown away almost immediately.

                                    1. 3

                                      You don’t have to, at least if you use submodules. You can commit a breaking change to a library, push it, run CI on it (have it build on all supported platforms and run its test suite, and so on). Then you push a commit to each of the projects that consumes the library that atomically updates the submodule and updates all callers. This also reduces the CI load because you can test the library changes and then the library-consumer changes independently, rather than requiring CI to completely pass all tests at once.

                                      1. 3

                                        I’m working in an embedded field where microcontrollers imply tight resource constraints. That often limits how many abstractions you can introduce for backwards-compatibility.

                                        A simple change could be a type which stores “miles” and now also needs “kilometers”. If you extend the type (backwards compatible) it becomes larger. Multiplied by many uses all across the system, that can easily blow up to a few kilobytes and cross some limits.
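
                                        As a toy illustration (hypothetical types; exact sizes and padding depend on the target ABI):

                                          #include <cstdint>

                                          struct DistanceV1 {
                                              std::uint16_t miles;
                                          };  // 2 bytes on a typical ABI

                                          // "Backwards compatible" extension: old users still compile, but the type grows.
                                          struct DistanceV2 {
                                              std::uint16_t miles;        // kept for existing users
                                              std::uint16_t kilometers;   // the new representation
                                              std::uint8_t  unit;         // which field is authoritative
                                          };  // 6 bytes on a typical ABI (including padding)

                                          // A few thousand instances across the system triple the footprint of this type,
                                          // which is exactly the kind of limit crossing described above.
                                          static_assert(sizeof(DistanceV2) > sizeof(DistanceV1), "the extension costs memory");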

                                        Another example: A type change meant that an adapter had to be introduced between two components where one used the old and the other the new type. Copying a kilobyte of data can already cross a runtime limit.

                                        I do admit that microcontrollers are kinda special here and in other domains the cost of abstractions for backwards-compatibility is usually negligible.

                                    1. 1

                                      in C++ the keyword const does not completely refer to immutability. For instance, you can use the keyword const in a function prototype to indicate you won’t modify it, but you can pass a mutable object as this parameter.

                                      I don’t know C++ well enough, but doesn’t const make the object itself immutable, not only the variable holding it? Unlike most languages, e.g. JavaScript, where const only makes the variable constant, not its value. That is, you can’t call non-const methods on the object and you can’t modify its fields. At least if it’s not a pointer to an object; for pointers it seems to be complicated. I thought this works almost the same way as in Rust, where you can’t mutate through non-mut references.

                                      1. 7

                                        I don’t know C++ well enough, but doesn’t const make the object itself immutable, not only the variable holding it?

                                        It’s C++ so the answer to any question is ‘it’s more complicated than that’. The short answer is that const reference in C++ cannot be used to modify the object, except when it can.

                                        The fact that the this parameter in C++ is implicit makes this a bit difficult to understand. Consider this in C++:

                                        struct Foo
                                        {
                                           void doAThing();
                                        };
                                        

                                        This is really a way of writing something like:

                                        void doAThing(Foo *this);
                                        

                                        Note that this is not const-qualified and so you cannot implicitly cast from a const Foo* to a Foo*. Because this is implicit, C++ doesn’t let you put qualifiers on it, so you need to write them on the method instead:

                                        struct Foo
                                        {
                                           void doAThing() const;
                                        };
                                        

                                        This is equivalent to:

                                        void doAThing(const Foo *this);
                                        

                                        Now this works with the same overload resolution rules as the rest of C++: You can call this method with a const Foo* or a Foo*, because const on a parameter just means that the method promises not to mutate the object via this reference. There are three important corner cases here. First, consider a method like this:

                                        struct Foo
                                        {
                                           void doAThing(Foo *other) const;
                                        };
                                        

                                        You can call this like this:

                                        Foo f;
                                        const Foo *g = &f;
                                        g->doAThing(&f);
                                        

                                        Now the method has two references to f. It can mutate the object through one but not the other. The second problem comes from the fact that const is advisory and you can cast it away. This means that it’s possible to write things in C++ like this:

                                        struct Foo
                                        {
                                           void doAThing();
                                           void doAThing() const
                                           {
                                             const_cast<Foo*>(this)->doAThing();
                                           }
                                        };
                                        

                                        The const method forwards to the non-const one, which can mutate the class (well, not this one because it has no state, but the same is valid in a real thing). The second variant of this is the keyword mutable. This is intended to allow C++ programmers to write logically immutable objects that have internal mutability. Here’s a trivial example:

                                        struct Foo
                                        {
                                           mutable int x = 0;
                                           void doAThing() const
                                           {
                                             x++;
                                           }
                                        };
                                        

                                        Now you can call doAThing with a const pointer but it will mutate the object. This is intended for things like internal caches. For example, clang’s AST needs to convert from C++ types to LLVM types. This is expensive to compute, so it’s done lazily. You pass around const references to the thing that does the transformation. Internally, it has a mutable field that caches prior conversions.

                                        Finally, const does not do viewpoint adaptation, so just because you have a const pointer to an object does not make const transitive. This is therefore completely valid:

                                        struct Bar
                                        {
                                            int x;
                                        };
                                        struct Foo
                                        {
                                          Bar *b;
                                          void doAThing() const
                                          {
                                            b->x++;
                                          }
                                        };
                                        

                                        You can call this const method and it doesn’t modify any fields of the object, but it does modify an object that a field points to, which means it is logically modifying the state of the object.

                                        All of this adds up to the fact that compilers can do basically nothing in terms of optimisation with const. The case referenced from the talk was of a global. Globals are more interesting because const for a global really does mean immutability, it will end up in the read-only data section of the binary and every copy of the program / library running will share the same physical memory pages, mapped read-only[1]. This is not necessarily deep immutability: a const global can contain pointers to non-const globals and those can be mutated.

                                        In the specific example, that global was passed by reference and so determining that nothing mutated it required some inter-procedural alias analysis, which apparently was slightly deeper than the compiler could manage. If Jason had passed the sprite arrays as template parameters, rather than as pointers, he probably wouldn’t have needed const to get to the same output. For example, consider this toy example:

                                        namespace 
                                        {
                                          int fib[] = {1, 1, 2, 3, 5};
                                        }
                                        
                                        int f(int x)
                                        {
                                            return fib[x];
                                        }
                                        

                                        The anonymous namespace means that nothing outside of this compilation unit can write to fib. The compiler can inspect every reference to it and trivially determine that nothing writes to it. It will then make fib immutable. Compiled with clang, I get this:

                                                .type   _ZN12_GLOBAL__N_13fibE,@object  # @(anonymous namespace)::fib
                                                .section        .rodata,"a",@progbits
                                                .p2align        4
                                        _ZN12_GLOBAL__N_13fibE:     # (anonymous namespace)::fib
                                                .long   1                               # 0x1
                                                .long   1                               # 0x1
                                                .long   2                               # 0x2
                                                .long   3                               # 0x3
                                                .long   5                               # 0x5
                                                .size   _ZN12_GLOBAL__N_13fibE, 20
                                        

                                        Note the .section .rodata bit: this says that the global is in the read-only data section, so it is immutable. That doesn’t make much difference, but the fact that the compiler could do this transform means that all other optimisations can depend on fib not being modified.

                                        Explicitly marking the global as const means that the compiler doesn’t need to do that analysis: it can always assume that the global is immutable because it’s UB to mutate a const object (and a compiler is free to assume UB doesn’t happen). You could pass a pointer to the global to another compilation unit that cast away the const and tried to mutate it, and on a typical OS that would then cause a trap. Remember this example the next time someone says compilers shouldn’t use UB for optimisations: if C/C++ compilers didn’t depend on UB for optimisation then they couldn’t do constant propagation from global constants without whole-program alias analysis.

                                        For anything else, the guarantees that const provides are so weak that they’re useless. Generally, the compiler can either see all accesses to an object (in which case it can infer whether it’s mutated and get more accurate information than const) or it can’t see all accesses to an object (and so must assume that one of them may cast away const and mutate the object).

                                        [1] On systems with MMUs. And if the global needs to contain relocations (e.g. pointers to other globals), it may actually be mutable unless you’ve linked with relro support.

                                        1. 1

                                          No, you might have D in mind where const is transitive.

                                        1. 8

                                          Wrong title. This post is fairly interesting and well written, but it doesn’t really explain why we need build systems. Instead, it tells us what build systems do. And while I do see the author trying to push us towards widely used build systems such as CMake, he offers little justification. He mentions that most developers seem to think CMake makes them suffer, but then utterly fails to address the problem. Are we supposed to just deal with it?

                                          For simple build system like GNU Make the developer must specify and maintain these dependencies manually.

                                          Not quite true; there are tricks that allow GNU Make to keep track of dependencies automatically, thanks to the -M option of GCC and Clang. Kind of a pain in the butt, but it can be done.

                                          A wildcard approach to filenames (e.g. src/*.cpp) superficially seems more straightforward as it doesn’t require the developer to list each file allowing new files to be easily added. The downside is that the build system does not have a definitive list of the source code files for a given artefact, making it harder to track dependencies and understand precisely what components are required. Wildcards also allow spurious files to be included in the build – maybe an older module that has been superseded but not removed from the source folder.

                                          First, tracking dependencies should be the build system’s job. It can and has been done. Second, if you have spurious files in your source tree, you should remove them. Third, if you forget to remove an obsolete module, I bet my hat you also forgot to remove it from the list of source files.

                                          Best practice says to list all source modules individually despite the, hopefully minor, extra workload involved when first configuring the project or adding additional modules as the project evolves.

                                          In my opinion, best practice is wrong. I’ll accept that current tools are limited, but we shouldn’t have to redundantly type out dependencies that are right there in the source tree.


                                          That’s it for the hate. Let’s talk solutions. I personally recommend taking a look at SHAKE, as well as the paper that explains the theory behind it (and other build systems as well). I’ve read the paper, and it has given me faith in the possibility of better, simpler build systems.

                                          1. 3

                                            We need to distinguish between build execution (ninja) and build configuration (autotools). The paper is about the execution. Most of the complexity is in the configuration. (The paper is great though 👍)

                                            1. 2

                                              I have looked at SHAKE and its paper before, but I am curious: what would you like to see in a build system?

                                              I ask because I am building one. 1

                                              1. 4

                                                I’m a peculiar user. What I want (and build) is simple, opinionated software. This is the Way.

                                                I don’t need, nor want, my build system to cater to God knows how many environments, like CMake does. I don’t care that my dependencies are using CMake or the autotools. I don’t seek compatibility with those monstrosities. If it means I have to rewrite some big build script from scratch, so be it. Though in all honesty, I’m okay with just calling the original build script and using the artefacts directly.

                                                I don’t need, nor want, my build system to treat stuff like unit testing and continuous integration specially. I want it to be flexible enough that I can generate a text file with the test results, or install & launch the application on the production server.

                                                I want my build system to be equally useful for C, C++, Haskell, Rust, LaTeX, and pretty much anything. Just a thing that uses commands to generate missing dependencies. And even then most commands can be as simple as calling some program. They don’t have to support Bash syntax or whatever. I want multiple targets and dynamic dependencies. And most of all, I want a strong mathematical foundation behind the build system. I don’t want to have to rebuild the world “just in case”.


                                                Or, I want a magical build system where I just tell it where the entry point of my program is, and it just fetches and builds the transitive closure of the dependencies. Which seems possible in some closed ecosystems like Rust or Go. And I want that build system to give me an easy way to run unit tests as part of the build, as well as installing my program, or at least giving me installation scripts. (This is somewhat contrary to the generic build system above.)

                                                That said, if the generic build system can stay simple and is easy enough to use, I probably won’t need the “walled garden” version.

                                                1. 2

                                                  Goodness; you know exactly what you want.

                                                  Your comment revealed some blind spots in my current design. I am going to have to go back to the drawing board and try again.

                                                  I think a big challenge would be to generate missing dependencies for C and C++, since files can be laid out haphazardly with no rhyme or reason. However, for most other languages, which have true module systems, that may be more possible.

                                                  Thank you.

                                              2. 2

                                                The real reason why globbing source files is unsound, at least in the context of CMake:

                                                Note: We do not recommend using GLOB to collect a list of source files from your source tree: If no CMakeLists.txt file changes when a source is added or removed, then the generated build system cannot know when to ask CMake to regenerate.

                                                I heard the same reason is why Meson doesn’t support it.

                                                1. 2

                                                  Oh, so it’s a limitation of the tool, not something we actually desire… Here’s what I think: such glob patterns would typically be useful at link time, where you want to have the executable (or library) to aggregate all object files. Now the list of object files depend on the list of source files, which itself depends on the result of the glob pattern.

                                                  So to generate the program, the system would fetch the list of object files. That list depends on the list of source files, and should be regenerated whenever the list of source files changes. As for the list of source files, well, it changes whenever we actually add or remove a source file. As for how we should detect it, well… this would mean generating the list anew every time, and seeing if it changed.

                                                  Okay, so there is one fundamental limitation here: if we have many many files in the project, using glob patterns can make the build system slower. It might be a good idea in this case to fix the list of source files. Now, I still want a script that lists all available source files so I don’t have to manually add it every time I add a new file. But I understand the rationale better now.

                                                  2. 1

                                                    First, tracking dependencies should be the build system’s job. It can and has been done.

                                                    see: tup

                                                    Second, if you have spurious files in your source tree, you should remove them.

                                                    Conditionally compiling code on the file level is one of the best ways to do it, especially if you have some kind of plugin system (or class system). It’s cleaner than ifdefing out big chunks of code IMO.

                                                    Traditionally, the reason has been that if you want make to rebuild your code correctly when you remove a file, you have to do something like

                                                    OBJS := $(wildcard *.c)

                                                    # Record the value of variable NAME in the file .NAME.var, rewriting
                                                    # (and thus touching) it only when the value changes.
                                                    .%.var: FORCE
                                                    	@echo $($*) | cmp - $@ || echo $($*) > $@

                                                    # FORCE is always out of date, so the pattern rule above runs on every build.
                                                    FORCE:

                                                    my_executable: $(OBJS) .OBJS.var
                                                    	$(CC) $(LDFLAGS) -o $@ $(OBJS) $(LDLIBS)
                                                    

                                                    which is a bit annoying, and definitely error-prone.

                                                    Third, if you forget to remove an obsolete module, I bet my hat you also forgot to remove it from the list of source files.

                                                    One additional reason is that it can be nice when working on something which hasn’t been checked in yet. Imagine that you are working on adding the new Foo feature, which lives in foo.c. If you then need to switch branches, git stash and git checkout will leave foo.c lying around. By specifying the sources you want explicitly, you don’t have to worry about accidentally including it.

                                                    1. 1

                                                      Conditionally compiling code on the file level is one of the best ways to do it, especially if you have some kind of plugin system (or class system). It’s cleaner than ifdefing out big chunks of code IMO.

                                                      Okay, that’s a bloody good argument. Add to that the performance implication of listing every source file every time you build, and you have a fairly solid reason to maintain a static list of source files.

                                                      Damn… I guess I stand corrected.

                                                  1. 2

                                                    The first trick here is that function parameters evaluation order is unspecified, meaning that new Widget might be called, then priority(), then the value returned by new Widget is passed to std::shared_ptr(…)

                                                    I know the order of evaluation of function parameters is undefined, but I’ve never heard of the compiler being allowed to skip around between multiple function calls evaluating a parameter here, a parameter there… I don’t actually own EffC++; can someone verify this is true?

                                                    In other words, my understanding is that the compiler will first fully evaluate one parameter of processWidget, then the other. The order may be unspecified, but after the “new” operator we know the next call will be to the shared_ptr constructor. Thus there’s no chance of a leak … as I understand it.
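
                                                    For concreteness, the call being discussed looks roughly like this (a sketch; std::make_shared is the commonly suggested fix):

                                                      #include <memory>

                                                      class Widget {};
                                                      int priority();                                  // may throw
                                                      void processWidget(std::shared_ptr<Widget> w, int prio);

                                                      void caller() {
                                                          // The claim: the compiler may evaluate new Widget, then priority(),
                                                          // then the shared_ptr constructor. If priority() throws in between,
                                                          // the raw Widget* leaks.
                                                          processWidget(std::shared_ptr<Widget>(new Widget), priority());

                                                          // The usual fix: no naked new, so nothing can happen between
                                                          // allocation and ownership.
                                                          processWidget(std::make_shared<Widget>(), priority());
                                                      }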

                                                    1. 6

                                                      I was surprised at first, but after double-checking, the boost docs and Herb Sutter back it up. Wild.

                                                      1. 3

                                                        Wild indeed. I still don’t want to admit this is true, 🙈 so I’m desperately seizing on the disclaimer at the top of the Sutter article:

                                                        This is the original GotW problem and solution substantially as posted to Usenet. See the book More Exceptional C++ (Addison-Wesley, 2002) for the most current solution to this GotW issue. The solutions in the book have been revised and expanded since their initial appearance in GotW. The book versions also incorporate corrections, new material, and conformance to the final ANSI/ISO C++ standard.

                                                        The article isn’t dated, but it must be from before 2002. I wonder if this part of the C++ spec has changed since then, considering how unintuitive this behavior is. 🤞🏻😬

                                                        1. 2

                                                          So C++ is non-strict?

                                                        2. 3

                                                          Imagine an arithmetic expression like (a+b)*(c+d). The two additions are independent and can be computed in parallel. If the CPU has multiple ALUs for the parallel computation and there are enough registers to hold the data, the compiler should interleave the evaluation to enable instruction-level parallelism.

                                                          Such an optimization can result in this „skipping around“.

                                                        1. 2

                                                          My very first FP language was Scala, which has first class objects. Granted this is one of the rare ones, but OO FP does exist.

                                                          1. 3

                                                        The article is all about how FP and OO co-exist without issue and there is no “vs”. I would go one step further and say there is no “OO language” or “FP language” – every language with a feature that can simulate a closure can describe both paradigms at will.

                                                            1. 2

                                                              Scala had the explicit goal to combine FP and OO nicely.

                                                            1. 18

                                                              As the leading architect of a project, I asked some developers what they thought of my „leading“ there. One suggestion was that I could have been more confident.

                                                              I believe it is a human thing to long for confident leaders. Developers are no exception. The „strong opinions, weakly held“ meme is a symptom. It isn’t generally good or bad.

                                                              With the developers we concluded that I was roughly as confident as the circumstances permitted.

                                                              1. 5

                                                                Oh yeah, definitely. I’ll add to that that people also want leaders with prestige (or high status if you will).

                                                                There is one negative interpretation and a positive one that I oscillate between:

                                                                1. People are bad with uncertainty, so it’s not received well if leadership says “we will do X, and it has a 75% chance of success”. Or worse: “we want X, but it’s not a strongly held opinion, feel free to disagree”

                                                                2. Part of leadership’s job is to create clarity and it’s necessary to just say “we’re sure about this decision, let’s go”. That doesn’t necessarily imply skewing the facts. But it helps tremendously to not have decision-makers that seem confused and fluffy and all over the place. Having insecure managers is terrible and not helpful at all.

                                                              1. 3

                                                                Something I really want is a UI (probably browser-based) that will let you type languages like pikchr on the left and render it on the right automatically.

                                                                I would use it for graphviz also, etc.

                                                                Does this already exist?

                                                                I wrote something sort of like this (motivated by R plotting) several years ago, but there are a bunch of things about it that aren’t great: https://github.com/andychu/webpipe

                                                                It uses inotify-tools connected to a “hanging GET” to auto-refresh on save.

                                                                1. 2
                                                                  1. 1

                                                                Yes, that’s the right idea! I think the button can be replaced with a keyboard shortcut easily.

                                                                    I would like something that generalizes to any tool, maybe with some kind of CGI interface. It looks like this is in Java, and I suspect https://pikchr.org/home/pikchrshow is in Tcl since it’s from Dr. Hipp. I probably would hack on the Tcl version first, although Python or PHP would also work.

                                                                  2. 1

                                                                    Pikchr has an inbuilt side-by-side sandbox functionality which makes the editing experience a lot easier.

                                                                    1. 1

                                                                      Ah OK I see this, it’s pretty close to what I want. I would like to enter a shell command and use it for any such tool! It can probably be hacked up for that purpose

                                                                      https://pikchr.org/home/pikchrshow

                                                                  1. 17

                                                                    Am I missing something from this story?

                                                                    She told me she knew I was busy with work and the app helped make certain she was on my mind and continued to keep communication high between us.

                                                                  I thought the whole point of this is that you’re busy/distracted/mentally engaged/what have you with work, and thus she isn’t on your mind?

                                                                    I’m also still not even sure what it’s supposed to do. What does “automate your text messages” even mean?

                                                                    Does it just send random non committal messages to people? Or canned responses to messages?

                                                                    For anything outside of the “I’m driving and will see this when I stop” type auto responses I don’t see how “automation” is actually useful?

                                                                    1. 5

                                                                    This starts conversations with people by sending the starting text; it doesn’t have the full conversation.

                                                                      Some people, especially those that grew up with ubiquitous phones in school, see texting frequently as how you show you care about someone. This ensures that if you haven’t sent a text or called in a while, you start something automated to jump start the conversation.

                                                                      1. 2

                                                                        Perhaps it’s not clear enough from the story, but, for my use case, it allows me to provide my significant other with a quick “bid for affection” without breaking my focus; a notification pops up and I simply swipe it away and the automation happens. In other cases where I want more of a connection or dialog, I will configure the app to remind me at more convenient times and ask more open-ended questions.

                                                                        1. 2

                                                                          So it is more of a texting reminder with builtin suggestions?

                                                                          The link to the app does not work for me and the article does not really describe the app itself.

                                                                          1. 2

                                                                          In a nutshell, yes. Depending on how you configure it, its behavior changes, and it also handles other communication methods. The story purposefully focuses on my experience with indie app development and not the details of the app itself. The app is currently in beta, so it’s limited to 47 countries/regions to better support this, but feel free to let me know what country you are in or send me a message.

                                                                      1. 1

                                                                        LoR monorepo

                                                                        Originally „monorepo“ meant one repo for the whole company. Here and also at my company people use the term for „one project repo“ now. Is that common?

                                                                        It seems the term is getting useless like „big data“.

                                                                        1. 3

                                                                          I think the term generally means ‘one repo containing multiple things that could be built independently’. For example, there’s an LLVM monorepo that contains LLVM, clang, lld, libc++, and so on even though libc++ is a completely separable component and folks that work on it don’t need anything else from the LLVM repo.

                                                                        1. 12

                                                                          This is an advertisement for their git commit-based metrics service, which is about as awful as one can get.

                                                                          I get these emails constantly as I have an admin account on our enterprise Pluralsight account, and I manage a developer team, and honestly this idea of “let’s use commit metrics to judge your developers” is not something I think anyone would welcome. Let me share a recent email they sent:

                                                                          Hey Adam,

                                                                          You can’t lead to what you can’t measure, and every day Engineers create troves of data that are currently unused.

                                                                          Sales, marketing, and finance use data daily to refine their processes.

                                                                          However, your engineering team’s git history is an untapped goldmine of data that can help you manage your teams and communicate with external stakeholders. There’s almost limitless data in your Git history waiting to be used.

                                                                          Ready to advocate for your team with concrete metrics?

                                                                          I have some availability this week to connect, when works for you?

                                                                          Thanks,

                                                                          1. 5

                                                                            This reads like a horoscope. And within it are really questionable assertions. “A developer in the zone produces evenly spaced, similarly sized pull requests”. Do people actually believe this?

                                                                            1. 2

                                                                              If the opposite means giving devs an hour between each meeting to complete their work, being in the zone is a wonderful thing.

                                                                            2. 3

                                                                              Well, I do think there is valuable data in the git history but not to judge developers. For example, the files with the most commits or the most authors are hot spots and a refactoring should be considered.

                                                                              1. 1

                                                                                this idea of “let’s use commit metrics to judge your developers” is not something I think anyone would welcome

                                                                                Maybe welcomed by the ones doing the judging; the ones being judged, not so much. :)

                                                                                1. 1

                                                                                  I flirted with Codermetrics[1] after reading the O’Reilly book back in the day. I think it could genuinely be useful as a tool for self-improvement. Inevitably, though, it would be turned into a performance management tool and thence gamed or otherwise take on a life of its own.

                                                                                  [1] https://codermetrics.org/

                                                                                1. 2

Hm, I don't see any good argument for why it matters?

                                                                                  1. 1

                                                                                    One reasonable line of argument is that if categorical presentations are interesting, then concatenative languages give their family of possible grammars. This isn’t just theoretically interesting, but has been used; Compiling to Categories is a common recent paper to cite.

                                                                                  1. 2

                                                                                    Most of you have probably seen this by now but I’ll leave it here for those who haven’t.

                                                                                    Also…

1990s   Pentium PC     WWW
2000s   Laptop         Web 2.0
2010s   Smart phones   Apps
2020s   Wearables      TBD
2030s   Embeddables    TBD

I’ve seen this table in 2000 and 2010 and now again in 2020. Each time, "wearables" are touted as the next decade's big thing. I think it's something we won't achieve before the year of Linux on the desktop :-).

                                                                                    Granted, people have been singing dirges for the personal computer since about that same time, too. First it was thin clients (were it not for that stupid slow-ass network!). Then it was phones and tablets (were it not for them simpletons whose work did not consist of forwarding emails and attending meetings). But, you know, if you predict things at a high enough rate, some of them are bound to come true.

                                                                                    1. 2

2020s: smart watches, fitness armbands.

                                                                                      They are not as dominant as the others though.

                                                                                      1. 1

                                                                                        I regularly take walks without my phone, wearing my cellular watch streaming audiobooks and podcasts to my wireless earbuds, responding to messages through the voice assistant. No “smartglasses” yet, but wearables are important today and a huge growth area.

Still, yeah, it doesn't feel anywhere near the impact of PCs or smartphones. Once glasses get here, I think it will.

                                                                                      1. 13

Give Ada a shot. It's used in high-reliability contexts, is ISO-standardized, has been in development since the 1970s and constantly updated, and had a pointer ownership model long before Rust came along and claimed to have invented it. Admittedly, it's not as "cool" and your code looks "boring", but it is very readable. The type system is also very strong (you could, for instance, define a type that can only hold primes, or a tuple type that can only contain non-equal tuples; a rough sketch below), and even though Ada is OOP, which I generally dislike, they're doing it right.
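
A rough sketch of what I mean, using Ada 2012 predicates (the prime subtype, plus one reading of the non-equal-tuple example; all names are made up for illustration):

    package Constrained_Types is

       -- A subtype that can only hold prime numbers; the predicate is
       -- checked on assignment and conversion (with assertion checks on).
       subtype Prime is Positive with
         Dynamic_Predicate =>
           Prime > 1 and then
           (for all D in 2 .. Prime - 1 => Prime mod D /= 0);

       -- A pair whose two components are never allowed to be equal.
       type Distinct_Pair is record
          First, Second : Integer;
       end record with
         Dynamic_Predicate => Distinct_Pair.First /= Distinct_Pair.Second;

       P : Prime := 7;        -- fine
       -- Q : Prime := 8;     -- would fail the predicate check

    end Constrained_Types;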

Additional bonuses are a really strong static analyzer (GNATprove, which lets you statically (!) verify that there are no data races or exceptions in code written in the SPARK subset of Ada) and parallelism and concurrency built into the language (not some crate that changes every week).

Many claim that Ada is dead, but it's actually alive and kicking. Many people are using it; they are just not that vocal about it and simply get work done. As we can already see in this thread alone, the Rust evangelists love to spread their message, but if you ask me, Rust is doomed due to its lack of standardization.

You wouldn't build your house on quicksand (Rust), but on bedrock (Ada).

                                                                                        1. 4

How is the web stack in Ada? Database access? It seems very interesting, but the ecosystem might not be in place for this specific application, at least.

                                                                                          1. 3

Learning Ada at university, in courses on concurrent and parallel systems and on real-time and embedded systems, showed me that C being the default for those domains really was a mistake. Ada has a very expressive concurrency model; I haven't seen anything like it anywhere else (I love Haskell's concurrency features equally, but they are very different). The precision you can express with Ada is amazing. The example in our real-time course was defining a type which represented memory-mapped registers: it could precisely describe what every bit meant, in one of (IIRC) 8 alternative layouts depending on which instruction was being represented, and the type could be defined to exist only at the 8 memory locations where these registers were mapped. To do the same in C requires things which can only be described as hacks, and which don't tell the system important facts like "never allocate these addresses, they're already in use". The world has lost a lot by not paying more attention to Ada and hating on it without knowing the first thing about it.
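
A much-simplified sketch of that idea, with a single register; the bit layout and the address are invented for illustration:

    with System.Storage_Elements;

    package Device_Regs is

       type Interrupt_Level is range 0 .. 7;

       -- Say exactly what every bit of an 8-bit status register means.
       type Status_Register is record
          Ready    : Boolean;
          Error    : Boolean;
          Int_Mask : Interrupt_Level;
       end record;

       for Status_Register use record
          Ready    at 0 range 0 .. 0;
          Error    at 0 range 1 .. 1;
          Int_Mask at 0 range 2 .. 4;
       end record;
       for Status_Register'Size use 8;

       -- Place the object at the (made-up) address where the hardware
       -- maps the register, and mark it volatile so every access hits it.
       Status : Status_Register with
         Volatile,
         Address => System.Storage_Elements.To_Address (16#4000_0000#);

    end Device_Regs;

Reads and writes of Status.Ready or Status.Int_Mask then go straight to that mapped byte, with no casts or manual bit masking in the source.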

                                                                                            1. 2

Ada looks really interesting to me because of all the checks you do at compile time, ensuring your program is correct before even running it. It's also much more readable than something like, say, Rust. I would love something in between C and Ada, with lots of compile-time checks and the flexibility of C.

                                                                                              1. 6

After two days of kicking the tires on Ada, I've had nearly every opinion I had about it broken in a good way. I'm baffled that I'm already productive in a language that feels like I'm being paid to write extra words, like a serial fiction writer, but every so often there's some super useful bug-preventing thing I can do in a few lines of Ada which would be prohibitive or impossible in other languages (e.g. dynamic predicates, modular types; a tiny sketch below).
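
For instance, modular types give you well-defined wrap-around arithmetic in a couple of lines. A toy example (the names are made up):

    procedure Demo is
       type Ring_Index is mod 16;             -- values 0 .. 15, wrap-around arithmetic
       Next : Ring_Index := Ring_Index'Last;
    begin
       Next := Next + 1;                      -- wraps to 0 instead of overflowing
    end Demo;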

The compile-time checks it uses by default are in the vein of "if it compiles, it probably works", like Haskell or Rust. Within the same program you can turn on more intricate compile-time and flow checks by annotating parts of your program to use SPARK, which is a subset of Ada and can coexist in the same project with your other code. The best way to describe it is almost like being able to use extern "C" within C++ codebases, or unsafe blocks in Rust, to change what language features the compiler allows. Except that code seems safe by default, and SPARK means "this part is mission critical and is written in a reduced subset of the language to assist verification": e.g. functions must be stateless, dependencies between function inputs and outputs are checked, etc.
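
A rough sketch of what that opt-in looks like (the contract and names are invented; GNATprove then tries to discharge the checks statically, without running the code):

    -- Ordinary Ada elsewhere in the project; this unit opts into the SPARK subset.
    package Counters with SPARK_Mode is

       Max : constant := 1_000;

       subtype Count is Natural range 0 .. Max;

       -- Pre/Post form a contract that the prover tries to verify.
       function Bump (C : Count) return Count with
         Pre  => C < Max,
         Post => Bump'Result = C + 1;

    end Counters;

    package body Counters with SPARK_Mode is

       function Bump (C : Count) return Count is
       begin
          return C + 1;
       end Bump;

    end Counters;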

                                                                                                1. 4

Let's say I'm sold on this: what's the best way to learn Ada for, say, writing a web app or doing ETL?

                                                                                                  1. 3

I would do much the same as for any other language: throw some terms into Google and go from there. I'd download GNAT, play with some toy programs, and maybe try out Ada Web Application.

                                                                                                    1. 1

Right, I threw in some search terms, but I was wondering if you had any insights beyond that. In particular, how do people discover Ada packages?

                                                                                                      1. 2

I'm sorry, I didn't know if you were being sarcastic. Sigh, the state of the internet these days.

Honestly, I have no idea. I'm just googling around trying to figure stuff out, and this language feels like crawling into the operator seat of an abandoned earthmover and wondering, "What does this lever do?" I used to work on ships with life-or-death systems, and Ada feels much along those lines: industrial, as in an industrial manufacturing or maritime environment, not a bureaucratic, office, or software-based one. They don't use a tool because it's popular; they use it because it does the right thing within the technical specs, can be easily documented, and prevents mistakes, because people's lives depend on it.

                                                                                                        1. 3

To answer my own question, it looks like there is a beta package manager and index that provides a jumping-off point to find stuff: https://alire.ada.dev/search/?q=Web

                                                                                                          1. 2

                                                                                                            Neat! I hadn’t found that yet.

I poked around a bit last night and found that the AdaCore GitHub account has a lot of things, like unit testing (AUnit), an Ada language server, and a lot more than I thought would be there. My first major gripe is that gnattest isn't part of GNAT Community and the AUnit links were broken, but I finally found it on that account. I still need to crawl through how the build system works and such if you're not going through a package manager.

                                                                                                    2. 1

If you have access to an Oracle installation, dive into your org's stored procs. PL/SQL is Ada with a SELECT statement.

                                                                                                    3. 3

                                                                                                      Thanks for your detailed elaboration which I can only agree with!

And on top of all those guarantees and safeguards, you can easily write parallel code using tasks, which are ingrained in the language. I find this truly remarkable; it actually makes writing web applications in Ada a sensible choice.
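
A tiny sketch of the built-in tasking (the worker and its workload are made up):

    with Ada.Text_IO;

    procedure Parallel_Hello is

       -- Tasks are a language feature, not a library.
       task type Worker (Id : Natural);

       task body Worker is
       begin
          Ada.Text_IO.Put_Line ("Hello from worker" & Natural'Image (Id));
       end Worker;

       A : Worker (Id => 1);
       B : Worker (Id => 2);

    begin
       null;   -- A and B run concurrently; the procedure waits for both to finish
    end Parallel_Hello;

Both task objects start running when the begin of Parallel_Hello is reached, and the procedure does not return until both have completed.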

                                                                                                    4. 6

Have you tried D? It works as a flexible language that can look like Python, yet you can tighten it up with annotations (const, @safe, …), and its metaprogramming can do plenty of compile-time checks, such as bounded numbers.

                                                                                                      1. 2

                                                                                                        I have thought about D, but it seems to fall right in the middle of lower and higher level languages. It lacks a niche, as I see it.

                                                                                                        I might be wrong though, I have yet to try it after all.

                                                                                                        Update: I checked it out, and it actually seems really interesting. I’ll try it out tomorrow.

                                                                                                      2. 3

I had the same thought as you when I first looked into Ada: that it might provide safety, but at the cost of losing closeness to the machine. However, you can get really close to the machine (for example, bit-perfect "structs" and types). On the other hand, yes, if you add too much of the flexibility C provides, you end up with possible pitfalls.

                                                                                                        1. 1

                                                                                                          That’s Pascal.

                                                                                                      1. 8

                                                                                                        Other fields of computer science don’t seem to have such a giant rift between the accomplishments of researchers and practitioners.

I don't think this is true. Academia is very, very bad at transferring its learning to practice. I think there are a lot of reasons: the standard for conference papers seems to be "I wrote a program and it definitely works, based on no actual testing outside of the people who wrote it", a format that is extraordinarily useless to other researchers except as a way of learning that someone might be working on that kind of thing, and completely useless to practitioners.

All of the tools mentioned here have wild claims attached, but they boil down to being good for the one use they were designed to be tested on, with the exception of the Excel case study.

Specific to this area of endeavor, there is also the problem that industry is intensely bad at making use of tools, ideas, skills, or professions that it has not encountered before. This comes from both software engineering arrogance ("we're so smart we know everything") and businesses being penny-wise ("we already pay so much for engineers, we can't afford to also make them more efficient").

                                                                                                        1. 7

                                                                                                          It varies a lot by subfield – I’d say that “systems” is sort of adjacent to software engineering tools, and a lot of those technologies have been commercialized. (e.g. think USENIX papers)

                                                                                                          VMWare came out of Stanford (there is a foundational paper about x86 virtualization). Spark and Mesos came out of Berkeley.

Margo Seltzer started companies around BerkeleyDB. Stonebraker started a whole bunch of RDBMS companies, which led to Postgres being open-sourced. I think NFS came from academia a long time ago.

                                                                                                          Anything regarding performance tends to get commercialized too.

I believe my former boss wrote the original paper on Control Flow Integrity, and that landed in Clang a few years ago, although there was a big time gap, and he "funded" one of his reports to implement it in Clang. Maybe it wouldn't have been implemented without that, but it looks like there are other non-open-source implementations too:

                                                                                                          https://en.wikipedia.org/wiki/Control-flow_integrity

                                                                                                          Software engineering tools tend to be commercialized less, but they’re by no means the only subfield of academia that doesn’t get put into practice.

I guess I would say "putting CS into practice" requires economic motivation. All the examples above have it. Software engineering tools, not so much, unfortunately.

JetBrains, Atlassian, and GitHub are exceptions to the rule that developer tools don't make money. But I would say that JetBrains and GitHub are concentrated on UI. A lot of tools require 90% of the work on UI and 10% on algorithms, which may be another thing that explains it.

                                                                                                          1. 4

I think that's a Texas sharpshooter fallacy: just because some things made it out into the world, it does not mean that a larger proportion of good ideas in systems work makes it out into the world.

                                                                                                            1. 4

I'm specifically arguing that there has to be an economic incentive, and there's less of it in dev tools and more in systems.

                                                                                                              Even though you didn’t provide any evidence for your claims, or specific experiences, here are dozens and dozens of papers that have been deployed in Linux:

                                                                                                              https://github.com/oilshell/blog-code/blob/master/grep-for-papers/linux.txt

                                                                                                              and LLVM:

                                                                                                              https://github.com/oilshell/blog-code/blob/master/grep-for-papers/llvm.txt

                                                                                                              1. 3

                                                                                                                Yes, and that’s the fallacy. Just because those papers made it into production code, it doesn’t mean that systems research in general gets into production more than other research.

                                                                                                                1. 4

                                                                                                                  I’m not claiming the evidence is airtight, just that it’s well supported, matches my experience, and you didn’t provide any evidence for your claim.

                                                                                                                  I don’t see how you can possibly claim that all fields of CS are equally bad at commercializing their work. On its face it seems silly. Some fields are closer to practice than others – if you’ve ever worked in an industrial research setting, that would be obvious.

                                                                                                                  1. 2

You could also consider these two examples as an approach to avoiding the problem: get the research tool under the umbrella of a bigger FOSS project, like Linux or LLVM here.

                                                                                                          1. 2

                                                                                                            Corresponding Rosetta page. Zig is still missing there.