1.  

      The goal is definitely GNU-free, but yea, it still depends on gmake to build some packages. It’s the only GNU dependency, too. A gmake replacement would finish the job.

      1.  

        Seems that you would have to replace freetype as well.

        Curious to read a little bit more about the rationale though. What’s so wrong about GNU software?

        1. 5

I think one advantage is that GNU has had something of a “monopoly” in a few areas, which hasn’t really improved things. The classic example of this is gcc; everyone had been complaining about its cryptic error messages for years and nothing was done. Clang enters the scene and lo and behold, suddenly it all could be improved.

          Some more diversity isn’t a bad thing; generally speaking I don’t think most GNU projects are of especially high quality, just “good enough” to replace Unix anno 1984 for their “free operating system”.

Personally I wouldn’t go so far as to make a “GNU-free Linux”, but in almost every case where a mature alternative to a GNU project exists, the alternative is clearly the better choice. Sometimes these better alternatives have existed for years or decades, but there’s a lot of inertia to get some of these GNU things replaced, and some effort to show “hey, X is actually a lot better than GNU X” isn’t a bad thing.

          1.  

            From the site:

            Why
            • Improve portability of open source software
            • Reduce requirements on GNU packages
            • Prove the “It’s not Linux it’s GNU/Linux …” copypasta wrong
            1.  

              Why not? (I’m not affiliated with the project, you’d have to ask the people maintaining it)

              1.  

Yeah, “why not?” is a valid reason imvho. I would still like to know which one theirs actually is. I often find that the rationale behind a project is a good way to learn things.

                And fair enough, I assumed you were affiliated. FWIW, Freetype is not a GNU project, but it is indeed fetched from savannah in their repos, which I found slightly funny.

                ETA: it also seems to be a big endeavor so the rationale becomes even more interesting to me.

              2.  

                Freetype is available under its own permissive license.

              3.  

                Why would anyone want to write a GNU make replacement when GNU make exists?

                1.  

They don’t want to do that. They want a non-GNU replacement for GNU make.

                  1.  

They could just fork GNU make. That would work, right?

                    1.  

                      The entire point here is to not contain any GNU code, so no.

                      1.  

                        zlib is derived from GNU code (gzip) so anything that includes zlib or libpng etc will “contain GNU code”. This includes for example the Linux kernel.

                      2.  

                        Or use bsdmake.

                        1.  

                          bsdmake is in fact included, but it can’t build all gmake makefiles unfortunately.

                          1.  

I used to do a fair bit of packaging on FreeBSD, and avoiding things like GNU make, autotools, libtool, bash, etc. will be hard and a lot of effort. You’ll essentially have to rewrite a lot of projects’ build systems.

                            Also GTK is GNU, and that’ll just outright exclude whole swaths of software, although it’s really just “GNU in name only” as far as I know.

                2.  

                  From the site:

                  Iglunix is a Linux distribution but, unlike almost all other Linux distributions, it has no GNU software¹

                  ¹With the exception of GNU make for now

                  1. 30

                    “People who really love a language criticizing it” is one of my favorite genres of blog post. I don’t know why but I love to read this stuff even if I have no intention of using Julia.

                    The startup and memory-intensive issues are really familiar to me as someone who’s spent a lot of time in Clojure. Really hammers home the point that there’s no such thing as a one-size-fits-all language, and it’s OK to focus on being excellent in a niche even if it means that no one can use your language to implement (say) grep.

                    1. 16

Such sentiment is frankly one of my favorite indications of someone’s experience with a tool. One cannot truly understand a tool until one can also (constructively) criticize it. My favorite software engineer interview question is to ask the interviewee what their favorite or preferred language is and what they like about it, then ask them what they dislike about it or warn new users about it. I get stereotypical answers to the softball first part, but the second part either draws really interesting answers when someone truly knows their stack, or gives a real indication that someone just doesn’t have the depth of experience I may be looking for in a role that requires knowing both sides of the blade.

                      1.  

                        My favorite software engineer interview question is to ask the interviewee what their favorite or preferred language is and what they like about it, then ask them what they dislike about it or warn new users about it.

                        My problem is I’m way more enthusiastic talking about the second one than the first one. There’s just so much more to talk about there!

                    1. 8

                      Fun to compare this to What’s bad about Julia?, another post where an expert talks about criticisms to their favorite language. Except in the Julia post, the vibe is less “expert dismissing any criticism about $lang as bonkers” and more “expert openly talking about the pros and cons of $lang.” I know which kind of post I prefer.

                      1.  

                        You don’t need to tell us about every single patch bump, especially when you don’t contribute anything to lobsters besides announcing patch bumps.

                        1. 5

                          Quality is not negotiable. […] Our only measure of progress is delivering into our customers hands things they find valuable.

                          I once asked Holub on Twitter what you should do if the client doesn’t consider security valuable. His response was something like “then don’t make it secure.”

Edit: found the thread; someone else was taking the hardline, but he was still handwaving away a lot of the concerns. I’ll be honest, I don’t really like Holub’s online personality or any of his stances, so that might be coloring how I see this.

                          1. 3

                            “effective software development” occurs in a realm abstracted away from “software development that people with money pay for”.

One would assume they were related somehow, but the computer industry has a long history of particularly successful businesses succeeding despite, rather than because of, their development practices. Alas, business success is then equated, by those desirous of business success, with Good/Effective Software Development practices.

                            And so life gets worse.

                            1. 3

                              I think the comparison to radium/asbestos is apt, and ultimately while I try to practice some basic degree of professional ethics myself, I don’t think we can really dig ourselves out of this at scale without externally imposed regulations.

                              See also: https://idlewords.com/talks/haunted_by_data.htm

                              1. 1

I’ll be honest, I don’t really like Holub’s online personality or any of his stances, so that might be coloring how I see this.

I appreciate your honesty and share your feelings. I find that Holub’s attitude tends toward opposite extremes: “idealism” at one end and “give them what they ask for” at the other, which can be hard to reconcile. I see shades of that extreme dichotomy of approach in this list.

                              1. 3

                                Welcome to Lobsters! Generally we don’t use the story text to explain the story, we put that in a comment instead.

                                (Also, we strongly encourage people to share articles and contribute to the community in ways besides advertising their products.)

                                1. 9

                                  Good writeup! One interesting thing is that it was harder to find where database calls were happening in the “abstracted” ruby than in the “inline” ruby. Best practices for functionality conflicting with best practices for optimization.

                                  1. 11

                                    For anyone with some Rails experience, this is a well-known problem: ActiveRecord has magic accessors which will make a query on-the-fly when you try to access a property that hasn’t been loaded. It then caches it so when you next access it, it will not perform a query. Django has similar behaviour (although it is slightly simpler). This can result in extremely hard to fathom code. “Is this doing a query? Why/why not?” is what you’ll be asking yourself all the time, and there are often no clues in the code you’re looking at: some other method may have already cached the data for you once you hit the code you’re looking at.

                                    Sometimes, a method can be a bottleneck because it does a lot of queries, but only on a particular code path, because when coming through another code path, it will be passed a pre-cached object. Doing performance analysis on such code bases can be quite frustrating!
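To make the trap concrete, here is a tiny framework-free sketch (hypothetical Author class and query log; ActiveRecord and Django do this with much more magic): a helper that looks pure quietly issues a query the first time a lazy attribute is touched, and reads from the cache afterwards.

QUERY_LOG = []

class Author:
    """Hypothetical lazy-loading record: fetches `name` on first access, then caches it."""
    def __init__(self, author_id):
        self.author_id = author_id
        self._name = None

    @property
    def name(self):
        if self._name is None:
            # Not preloaded: issue the query on the fly, like the magic accessors described above.
            QUERY_LOG.append(f"SELECT name FROM authors WHERE id = {self.author_id}")
            self._name = "fetched name"  # pretend this came back from the database
        return self._name

def title_line(title, author):
    # Looks like a pure formatting helper, but `author.name` may hit the database.
    return f"{title} by {author.name}"

author = Author(42)
print(title_line("Some Book", author))  # triggers the hidden query
print(title_line("Some Book", author))  # cached now: no query this time
print(len(QUERY_LOG))                   # 1

Whether a given call hits the database depends entirely on what ran before it, which is exactly why “is this doing a query?” is so hard to answer from the code in front of you.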

                                    1. 4

                                      Is there a generally-accepted way to deal with this problem? Is it just “don’t do that”? Asking as someone with little to no Rails experience.

                                      1. 3

                                        “It’s slow” is always hard to debug; “It’s slow, but only sometimes” more so.

                                        As an experienced rails dev, I lean heavily on runtime instrumentation in production.

                                        Looking for “where is the webserver spending most of its time” rapidly identifies the highest-priority issues; every serious instrumentation platform offers a straightforward way to view all call-sites which generate queries, along with their stack traces. From there it’s pretty easy to identify what the issue is; deciding where to add an explicit preload is the hardest part.

                                        1. 2

                                          I don’t know, but AFAIK you’re supposed to know what your access pattern is going to look like and prefetch/side load when needed. Or discover it when you find out there’s a bottleneck and fix it then.

                                          1.  

The general answer and the Rails-specific answer are pretty different.

                                            For ActiveRecord, there’s bullet, which helps avoid n+1s. But there’s no rails-native way of doing it, as far as I know. Lots of teams re-invent their own wheels.

                                        2. 5

Interesting observation! It reminds me of this post. (My) TL;DR there is that extracting functions only helps for pure code; when there are mutations to global state, it’s easier to see them all in a single place. It seems that OpenGL context and database connection are similar cases.

                                          This immediately suggests a rule-of-thumb for rails apps: do not abstract over SQL queries. When handling a single HTTP request, there should be one function, which lexically contains all the SQL queries for this request. It can call into helper (pure) functions, as long as they don’t issue queries themselves.
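As a rough illustration of that rule of thumb (hypothetical schema, sqlite in-memory just to keep the sketch runnable): the handler lexically contains every query for the request, and the helper it calls is pure.

import sqlite3

def handle_profile_request(db, user_id):
    # All queries for this request live here, in one place.
    name = db.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()[0]
    post_count = db.execute(
        "SELECT COUNT(*) FROM posts WHERE user_id = ?", (user_id,)
    ).fetchone()[0]
    return render_profile(name, post_count)

def render_profile(name, post_count):
    # Pure helper: only touches data already in memory, never the connection.
    return f"{name} has written {post_count} posts"

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER, name TEXT)")
db.execute("CREATE TABLE posts (id INTEGER, user_id INTEGER)")
db.execute("INSERT INTO users VALUES (1, 'alice')")
db.execute("INSERT INTO posts VALUES (1, 1), (2, 1)")
print(handle_profile_request(db, 1))  # alice has written 2 posts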

                                          1. 3

                                            It can call into helper (pure) functions, as long as they don’t issue queries themselves.

                                            As I pointed out in my sibling comment to yours, this is difficult to verify. You could have one method that performs a query and side-loads related objects from the other side of a belongs_to relation, and another that does some processing on those objects. But processing the objects requires following the relation, which will trigger a query when the objects haven’t been side-loaded. So you could have another method that forgets to do the side-loading and voila, your “pure” helper function now issues a query under the hood.

                                          2. 3

                                            Yes! Though I think even for functionality the inlining proved helpful. We encountered many cases where we were calling a function that took an array of elements but passed it only a single element. After inlining such a function we could remove a loop. In other cases we were calling a function with a hardcoded argument, and after inlining that we could remove unused conditional branches. It was super interesting to see how much logic evaporated in this process. Code that was useful at some point, but not anymore.

                                            I like what Sandi Metz wrote on this subject in The Wrong Abstraction.

                                          1. 1

                                            The feeling I have is that AI may do for art and music what it has done for playing Chess and Go.

                                            Indeed, ruin them.

                                            1. 5

                                              I don’t think that’s fair. AIs are better than humans in chess, but it’s widely accepted that they’ve made human-level play much better, by making chess training and board analysis widely available even to people who can’t afford coaches.

                                            1. 10

I’ve also been totally consumed by the same obsession. Lots of money, too: I want to be able to distribute VQGAN models across multiple GPUs, which is far beyond my minimal PyTorch know-how. So I’m now looking to pay a contractor.

I have this dream of making 2000x2000 pixel murals and printing them to canvas. AWS has EC2 configs with 96 gigs of GPU memory. I can’t stop thinking about this, and it’s disrupting my life.

                                              But it’s also exhilarating. I know it’s “just” an ai generator, but I’m still proud of the stuff I “make”. Here are some of my favorites:

                                              My daughter wants to be an artist. What should I tell her? Will this be the last generation of stylists, and we’ll just memorize the names of every great 20th century artist to produce things we like, forever?

                                              I worry about this too, but also am excited to see what artists do when they have these tools. And I think it’ll make artists turn more to things like sculpture and ceramics and other forms of art that are still out of the machine’s reach.

EDIT: also, a friend and I have been making games based off this. “Guess the city from this art” or “guess the media franchise”. It does really funny stuff to distinct media styles, like if you put in “homestar runner”.

                                              1. 5

                                                Just my random observation but “your” pieces and the post’s all give the vague appearance of something running through endless loops of simulacra. Said another way, they all share similar brush strokes.

                                                I think we’re headed into the (while looking at a Pollock) — “humph, my AI could have painted that!” era

                                                1. 4

                                                  There are a bunch of known public Colab notebooks but one is very popular. It’s fast but has this recognizable “brush stroke” indeed. Some GAN jockeys are tweaking the network itself though, and they easily get very different strokes at decent speeds. You don’t even need to know neural network math to tweak, just the will to dive in it. Break stuff, get a feel for what you like. If this is to become a staple artist’s tool it’ll have to be like that, more than just feeding queries.

                                                2. 3

                                                  These are cool. The “Old gods” one especially… if that was hung in your house and you told me you’d purchased it from an artist I wouldn’t blink. When you make them, are you specifying and tweaking the style, and then generating a bunch, and then hand-picking the one you like?

                                                  1. 3

Starting out I was just plugging intrusive thoughts into colab to see what I’d get. If it didn’t produce something interesting (not many do) I’d try another prompt. Recently I spent a lot of time writing a “pipeliner” program so I can try the same prompt on many different configs at once. I got the MVP working on Monday, but I’m putting it aside for a while so I can focus on scaling (it only works on one GPU, so can’t make anything bigger than 45k square pixels or so).

                                                    1. 1

                                                      Are you saying you’ve managed to get this to run locally? All the guides I’ve found are simply how to operate a Colab instance.

                                                      1. 2

I got it running locally, but I don’t have a GPU, so I upload it to an EC2 instance. I recently found that SageMaker makes this way easier and less burdensome, though.

                                                  2. 1

                                                    There are neural nets intended specifically for upscaling images. Pairing one of these with VQGAN image generation (which is pretty low res) might let you make larger scale art without a huge unaffordable GPU.

                                                  1. 2

What I like about Python 3.6 is that CPython preserves the order of dicts by default; hopefully this will become more than just an implementation detail.
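For anyone who hasn’t seen it in action, a quick sketch of what that ordering looks like in practice:

d = {}
d["first"] = 1
d["second"] = 2
d["third"] = 3
print(list(d))    # ['first', 'second', 'third'] -- insertion order is preserved
d["second"] = 99  # updating an existing key does not move it
print(list(d))    # still ['first', 'second', 'third']
del d["first"]
d["first"] = 0    # deleting and re-inserting puts the key at the end
print(list(d))    # ['second', 'third', 'first']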

                                                    1. 8

                                                      It’s official as of 3.7.

                                                      1. 1

                                                        Good to know, thanks!

                                                    1. 17

These rules make perfect sense in a closed, Google-like ecosystem. Many of them don’t make sense outside of that context, at least not without serious qualifications. The danger with articles like this is that they don’t acknowledge the contextual requirements that motivate each practice, making them liable to be cargo-culted into situations where they end up doing more harm than good.

                                                      Automate common tasks

                                                      Absolutely — unless building and maintaining that automation takes more time than just doing it manually. Which tends to happen, especially when you don’t have a team dedicated to infrastructure, and spending time on automation necessarily means not spending time on product development. Programmers love to overestimate the cost of toil, and the benefit of avoiding it; and to underestimate the cost of building and running new software.

                                                      Stubs and mocks make bad tests

                                                      Stubs and mocks are tools for unit testing, just one part of a complete testing breakfast. Without them, it’s more difficult to achieve encapsulation, build strong abstractions, and keep complex systems coherent. You need integration tests, absolutely! But if you just have integration tests, you’re stacking the deck against yourself architecturally.

                                                      Small frequent releases

                                                      No objection.

                                                      Upgrade dependencies early, fast, and often

                                                      Big and complex dependencies, subject to CVEs, and especially if they interface with out-of-process stuff that may not retain a static API? Absolutely. Smaller dependencies, stuff that just serves a single purpose? It’s make-work, and adds a small amount of continuous risk to your deployments — even small changes can introduce big bugs that skirt past your test processes — which may not be the best choice in all environments.

                                                      Expert makes everyone’s update

(Basically: update your consumers for them.) This one in particular is so pernicious. The relationship between author and consumer is one to many, with no upper bound on the many. Authors always owe some degree of care and responsibility to their consumers, but not, like, total fealty. That’s literally impossible in open ecosystems, and even in closed ones, taking it to this extreme rarely makes sense on a cost/benefit basis. Software is always an explorative process, and needs to change to stay healthy; extending authors’ domain of responsibility literally into the codebases of their consumers makes change just enormously difficult. That’s appropriate in some circumstances, where the cost of change is very high! But the cost of change is not always very high. Sometimes, often, it’s more important to let authors evolve their software relatively unconstrained than to bind them to Hyrum’s Law.

                                                      1. 9

                                                        Stubs and mocks are tools for unit testing, just one part of a complete testing breakfast. Without them, it’s more difficult to achieve encapsulation, build strong abstractions, and keep complex systems coherent.

                                                        extending authors’ domain of responsibility.. into the codebases of their consumers.. is appropriate in some circumstances..

                                                        The second bullet here rebuts the first if you squint a little. When subsystems have few consumers (the predominant case for integration tests), occasionally modifying a large number of tests is better than constantly relying on stubs and mocks.

                                                        You can’t just dream up strong abstractions on a schedule. Sometimes they take time to coalesce. Overly rigid mocking can prematurely freeze interfaces.

                                                        1. 2

                                                          I’m afraid I don’t really understand what you’re getting at here. I want to! Do you maybe have an example?

                                                          You can’t just dream up strong abstractions on a schedule. Sometimes they take time to coalesce. Overly rigid mocking can prematurely freeze interfaces.

                                                          I totally agree! But mocking at component encapsulation boundaries isn’t a priori rigid, I don’t think?

                                                          When subsystems have few consumers (the predominant case for integration tests), occasionally modifying a large number of tests is better than constantly relying on stubs and mocks.

                                                          I understand integration tests as whole-system, not subsystem. Not for you?

                                                          1. 2

                                                            I need to test one subsystem. I could either do that in isolation using mocks to simulate its environment, or in a real environment. That’s the trade-off we’re talking about, right? When you say “integration tests make it difficult to achieve encapsulation” I’m not sure what you mean. My best guess is that you’re saying mocks force you to think about cross-subsystem interfaces. Does this help?

                                                            1. 2

                                                              What is a subsystem? Is it a single structure with state and methods? A collection of them? An entire process?

                                                              edit:

                                                              I need to test one subsystem. I could either do that in isolation using mocks to simulate its environment, or in a real environment. That’s the trade-off we’re talking about, right? When you say “integration tests make it difficult to achieve encapsulation” I’m not sure what you mean.

                                                              Programs are a collection of components that provide capabilities and depend on other components. So in the boxes-and-lines architecture diagram sense, the boxes. They encapsulate the stuff they need to do their jobs, and provide their capabilities as methods (or whatever) to their consumers. This is what I’m saying should be testable in isolation, with mocks (fakes, whatever) provided as dependencies. Treating them independently in this way encourages you to think about their APIs, avoid exposing internal details, etc. etc. — all necessary stuff. I’m not saying integration tests make that difficult, I’m saying if all you have is integration tests, then there’s no incentive to think about componentwise APIs, or to avoid breaking encapsulation, or whatever else. You’re treating the whole collection of components as a single thing. That’s bad.

                                                              If you mean subsystem as a subset of inter-related components within a single application, well, I wouldn’t test anything like that explicitly.

                                                              1. 2

                                                                All I mean by it is something a certain kind of architecture astronaut would use as a signal to start mocking :) I’ll happily switch to “component” if you prefer that. In general conversations like this I find all these nouns to be fairly fungible.

                                                                More broadly, I question your implicit premise that encapsulation and whatnot is something to pursue as an end in itself. When I program I try to gradually make the program match its domain. My programs tend to start out as featureless blobs and gradually acquire structure as I understand a domain and refactor. I don’t need artificial pressures to progress on this trajectory. Even in a team context, I don’t find teams that use them to be better than teams that don’t.

                                                                I wholeheartedly believe that tests help inexperienced programmers learn to progress on this trajectory. But unit vs integration is in the noise next to tests vs no tests.

                                                                1. 2

                                                                  But unit vs integration is in the noise next to tests vs no tests.

                                                                  My current company is a strong counterpoint against this.

                                                                  Lots of integration tests, which have become sprawling, slow, and flaky.

                                                                  Very few unit tests – not coincidentally, the component boundaries are not crisp, how things relate is hard to follow, and dependencies are not explicitly passed in (so you can’t use fakes). Hence unit tests are difficult to write. It’s a case study in the phenomenon @peterbourgon is describing.

                                                                  1. 2

                                                                    I’ve experienced it as well. I’ve also experienced the opposite, codebases with egregious mocking that were improved by switching to integration tests. So I consider these categories to be red herrings. What matters is that someone owns the whole, and takes ownership of the whole by constantly adjusting boundaries when that’s needed.

                                                                    1. 2

                                                                      codebases with egregious mocking

                                                                      Agreed, I’ve seen this too.

                                                                      So I consider these categories to be red herrings.

                                                                      I don’t think this follows though. Ime, the egregious mocking always results from improper application code design or improper test design. That is, any time I’ve seen a component like that, the design (of the component, of the test themselves, or of higher parts of the system in which the component is embedded) has always been faulty, and the hard-to-understand mocks would melt away naturally when that was fixed.

                                                                      What matters is that someone owns the whole, and takes ownership of the whole by constantly adjusting boundaries when that’s needed.

                                                                      Per the previous point, ownership alone won’t help if the owner’s design skills aren’t good enough. I see no way around this, though I wish there were.

                                                                  2. 2

                                                                    More broadly, I question your implicit premise that encapsulation and whatnot is something to pursue as an end in itself. When I program I try to gradually make the program match its domain. My programs tend to start out as featureless blobs and gradually acquire structure as I understand a domain and refactor. I don’t need artificial pressures to progress on this trajectory. Even in a team context, I don’t find teams that use them to be better than teams that don’t.

                                                                    This is a fine process! Follow it. But when you put your PR up for review or whatever, this process needs to be finished, and I need to be looking at well-thought-out, coherent, isolated, and, yes, encapsulated components. So I think it is actually a goal in itself. Technically it’s meant to motivate coherence and maintainability, but I think it’s an essential part of those things, not just a proxy for them.

                                                          2. 5

                                                            Stubs and mocks are tools for unit testing, just one part of a complete testing breakfast. Without them, it’s more difficult to achieve encapsulation, build strong abstractions, and keep complex systems coherent. You need integration tests, absolutely! But if you just have integration tests, you’re stacking the deck against yourself architecturally.

Traditional OO methodology encourages you to think of your program as loosely coupled boxes calling into each other, and says your unit test should focus on exactly one box and stub out all the other boxes. But it’s not a suitable model for everything.

                                                            Consider a simple function for calculating factorial of n: when you write a unit test for it, you wouldn’t stub out the * operation, you take it for granted. But in a pure OO sense, the * operation is a distinct “box” that the factorial function is calling into, so a unit test that doesn’t stub out * is technically an integration test, and a “real” unit test should stub it out too. But we know that the latter is just meaningless (you’ll essentially be re-implementing *, but for a small set of operands in the stubs) and we still happily call the former a unit test.

A more suitable model for this scenario is to think of some dependencies as an implementation detail, and instead of stubbing them out, use either the real thing or something that replicates its behavior (called “fakes” at Google). These boxes might still be dependencies in a technical sense (e.g. subject to dependency injection), but they should be considered “hidden” in an architectural sense. The * operation in the former example is one such dependency. If you are unit testing some web backend, databases often fall into this category too.

                                                            Still, the real world is quite complex, and there are often cases that straddle the line between a loosely-coupled-box dependency and a mostly-implementation-detail dependency. Choosing between them is a constant tradeoff and requires evaluation of usage patterns. Even the * operation could cross over from the latter category to the former, if you are implementing a generic function that supports both real number multiplications and matrix multiplications, for example.
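A small sketch of that split (hypothetical names, not from any particular codebase): the factorial test takes * for granted, while the store dependency is replaced by a fake that actually behaves like a store, rather than a stub scripted call-by-call.

def factorial(n):
    result = 1
    for i in range(2, n + 1):
        result *= i  # an "implementation detail" dependency; nobody stubs this out
    return result

class FakeUserStore:
    """In-memory stand-in that replicates real store behaviour (a fake, not a mock)."""
    def __init__(self):
        self._users = {}

    def save(self, user_id, name):
        self._users[user_id] = name

    def get(self, user_id):
        return self._users.get(user_id)

def rename_user(store, user_id, new_name):
    if store.get(user_id) is None:
        raise KeyError(user_id)
    store.save(user_id, new_name)

# The tests exercise real behaviour against the fake, not a script of expected calls.
assert factorial(5) == 120
store = FakeUserStore()
store.save(1, "alice")
rename_user(store, 1, "bob")
assert store.get(1) == "bob"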

                                                            1. 6

                                                              Consider a simple function for calculating factorial of n: when you write a unit test for it, you wouldn’t stub out the * operation, you take it for granted. But in a pure OO sense, the * operation is a distinct “box” that the factorial function is calling into, so a unit test that doesn’t stub out * is technically an integration test, and a “real” unit test should stub it out too.

                                                              Imo this is a misunderstanding (or maybe that’s what you’re arguing?). You should only stub out (and use DI for) dependencies with side effects (DB calls, network calls, File I/O, etc). Potentially if you had some really slow, computationally expensive pure function, you could stub that too. I have never actually run into this use-case but can imagine reasons for it.

                                                              1. 2

                                                                I think we’re broadly in agreement.

                                                                But in a pure OO sense, the * operation is a distinct “box” that the factorial function is calling into, so a unit test that doesn’t stub out * is technically an integration test

                                                                Well, these terms aren’t well defined, and I don’t think this is a particularly useful definition. The distinct boxes are the things that exist in the domain of the program (i.e. probably not language constructs) and act as dependencies to other boxes (i.e. parameters to constructors). So if factorial took multiply as a dependency, sure.

                                                                instead of stubbing them out, use either the real thing or something that replicates its behavior

                                                                Names, details, it’s all fine. The only thing I’m claiming is important is that you’re able to exercise your code, at some reasonably granular level of encapsulation, in isolation.

If you have a component that’s tightly coupled to the database with bespoke SQL, then consider it part of the database, and use “the real thing” in tests. Sure. Makes sense. But otherwise, mocks (fakes, whatever) are a great tool to get to this form of testability, which is in my experience the best proxy for “code quality” that we’ve got.

                                                              2. 4

                                                                Absolutely — unless building and maintaining that automation takes more time than just doing it manually. Which tends to happen, especially when you don’t have a team dedicated to infrastructure, and spending time on automation necessarily means not spending time on product development.

                                                                Obligatory relevant XKCDs:

                                                                1. 2

                                                                  Stubs and mocks are tools for unit testing,

                                                                  Nope.

                                                                  Why I don’t mock

                                                                  1. 4

                                                                    Nope

                                                                    That mocks are tools for unit testing is a statement of fact?

                                                                    Why I don’t mock

                                                                    I don’t think we’re talking about the same thing.

                                                                    1. 1

                                                                      Mocks are tools for unit testing the same way hammers are tools for putting in screws.

                                                                      1. 2

                                                                        A great way to make pilot holes so you don’t split your board while putting the screw in?

                                                                        1. 1

A great way to split hairs without actually putting a screw in? ¯\_(ツ)_/¯

                                                                          1. 1

You seem way more interested in dropping zingers than actually talking about your position.

                                                                            1. 1

                                                                              I already spelled out my position in detail in my linked article, which echoes the experience that the Google book from TFA talks about.

                                                                              Should I copy-paste it here?

Mocks are largely a unit-testing anti-pattern: they can easily make your tests worse than useless, because you believe you have real tests, but you actually do not. This is worse than not having tests and at least knowing you don’t have tests. (It is also more work.) Stubs have the same structural problems, but are not quite as bad as mocks, because they are more transparent/straightforward.

                                                                              Fakes are OK.

                                                                              1. 1

                                                                                Mocks, stubs, fakes — substitutes for the real thing. Whatever. They play the same role.

                                                                                1. 1

                                                                                  They are not the same thing and do not play the same role.

                                                                                  I recommend that you learn why and how they are different.

                                                                                  1. 2

                                                                                    I understand the difference, it’s just that it’s too subtle to be useful.

                                                                                    1. 0

                                                                                      I humbly submit that if you think the difference is too subtle to be useful, then you might not actually understand it.

Because the difference is huge. And it seems that Google Engineering agrees with me. Now, the fact that Google Engineering believes something doesn’t automatically make it right; they can mess up like anyone else. On the other hand, they have a lot of really, really smart engineers, and a lot of experience building a huge variety of complex systems. So it seems at least conceivable that all of us (“Google and Me”, LOL) might have, in the tens or hundreds of thousands of engineer-years, figured out that a distinction that may seem very subtle on the surface is, in fact, profound.

                                                                                      Make of that what you will.

                                                                        2. 1

                                                                          I’m sure we’re not talking about the same thing.

                                                                  1. 2

                                                                    The first version of this query was buggy, because I carelessly used the obvious-looking condition “WHERE user1_id = 5 OR user2_id = 5”. This condition is wrong.

                                                                    For the slower among us, what makes this condition wrong? Is it because the SELECT only gets one of the two user_id values?

Both of those models frankly feel somehow weird; they go strongly against the usual effortlessness of relational database modeling. Maybe this is because of the additional invariants that are not handled directly by the table structure?

It feels like the “right” approach would be some kind of set type, where the values are collections of unique, unordered user_ids. Then the table constraint is |set| = 2 and friendship tests are select ... where 5 in set. But I don’t know if any SQL databases have a set data type.
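Not SQL, but here is the same idea sketched in plain Python (made-up data), just to show why unordered pairs are appealing: symmetry comes for free, and “5 in set” really is the whole membership test.

# Each friendship is an unordered pair, so there is no user1/user2 asymmetry to worry about.
friendships = {
    frozenset({5, 7}),
    frozenset({5, 9}),
    frozenset({7, 9}),
}

# The |set| = 2 invariant.
assert all(len(pair) == 2 for pair in friendships)

# "Are 5 and 7 friends?" -- the order of the pair never matters.
assert frozenset({7, 5}) in friendships

# "List 5's friends" -- the analogue of `select ... where 5 in set`.
friends_of_5 = sorted(next(iter(pair - {5})) for pair in friendships if 5 in pair)
print(friends_of_5)  # [7, 9]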

                                                                    1. 1

                                                                      I think I wrote something like

                                                                      SELECT user2_id
                                                                      FROM mutual_friendship
                                                                      WHERE user1_id = 5 OR user2_id = 5
                                                                      

                                                                      And this version survived several minutes of writing until I realized that it’s wrong.

                                                                      You could rewrite it something like (as suggested on Reddit):

                                                                      SELECT CASE user1_id WHEN 5 THEN user2_id ELSE user1_id END
                                                                      FROM mutual_friendship
                                                                      WHERE user1_id = 5 OR user2_id = 5
                                                                      

                                                                      I guess the idea of that sentence is that this “double-part” complexity needs to live somewhere, and any query would be somewhat awkward. Maybe I should rewrite it a bit better.

                                                                      It feels like the “right” approach would be some kind of set type

                                                                      I tried playing with that idea, but so far all attempts, when compiled to a classic relational framework (scalar columns), do not become more beautiful.

My idea was also that maybe we should just treat each friendship as an ordered tuple (with tuple-typed columns), but that does not allow elegantly querying the list of friends.

                                                                      1. 1

If you don’t have any other fields in the friendship table, maybe it’d make sense to store the undirected friendship edge as two rows in the table. An undirected graph with edges (v, u) can be converted to a directed one if you store two symmetric edges (v -> u), (u -> v). You can still INSERT and DELETE like now, and you can manage the symmetric edge with triggers. Perhaps this approach will simplify all those things that appear because of symmetry. I know you say this in the article, but n versus 2n isn’t that big a deal.

                                                                        1. 1

That’s what we started from, the two-row representation. In this thread we’re trying to go higher, beyond that.

                                                                    1. 38

                                                                      FWIW the motivation for this was apparently a comment on a thread about a review of the book “Software Engineering at Google” by Titus Winters, Tom Manshreck, and Hyrum Wright.

                                                                      https://lobste.rs/s/9n7aic/what_i_learned_from_software_engineering

                                                                      I meant to comment on that original thread, because I thought the question was misguided. Well now that I look it’s actually been deleted?

Anyway, the point is that the empirical question isn’t really actionable IMO. You could “answer it” and it still wouldn’t tell you what to do.

                                                                      I think you got this post exactly right – there’s no amount of empiricism that can help you. Software engineering has changed so much in the last 10 or 20 years that you can trivially invalidate any study.

                                                                      Yaron Minsky has a saying that “there’s no pile of sophomores high enough” that is going to prove anything about writing code. (Ironically he says that in advocacy of static typing, which I view as an extremely domain specific question.) Still I agree with his general point.


                                                                      This is not meant to be an insult, but when I see the names Titus Winters and Hyrum Wright, I’m less interested in the work. This is because I worked at Google for over a decade and got lots of refactoring and upgrade changelists/patches from them, as maintainer of various parts of the codebase. I think their work is extremely valuable, but it is fairly particular to Google, and in particular it’s done without domain knowledge. They are doing an extremely good job of doing what they can to improve the codebase without domain knowledge, which is inherent in their jobs, because they’re making company-wide changes.

                                                                      However most working engineers don’t improve code without domain knowledge, and the real improvements to code require such knowledge. You can only nibble at the edges otherwise.

@peterbourgon said basically what I was going to say in the original thread – this advice is generally good in the abstract, but it lacks context.

                                                                      https://lobste.rs/s/9n7aic/what_i_learned_from_software_engineering

                                                                      The way I learned things at Google was to look at what people who “got things done” did. They generally “break the rules” a bit. They know what matters and what doesn’t matter.

Jeff Dean and Sanjay Ghemawat indeed write great code and early in my career I exchanged a few CLs with them and learned a lot. I also referenced a blog post by Paul Buchheit in The Simplest Explanation of Oil.

For those who don’t know, he was the creator of Gmail, working on it for 3 years as a side project (and Gmail was amazing back then, faster than desktop MS Outlook, even though it’s rotted now). He mentions in that post how he prototyped some ads with the aid of some Unix shell. (Again, ads are horrible now, a cancer on the web – back then they were useful and fast. Yes really. It’s hard to convey the difference to someone who wasn’t a web user then.)

As a couple other anecdotes, I remember a coworker complaining that Guido van Rossum’s functions were too long. (Actually I somewhat agreed, but he did it in service of getting something done, and it can be fixed later.)

I also remember Bram Moolenaar’s (author of Vim) Java readability review, where he basically broke all the rules and got angry at the system. (For a brief time I was one of the people who picked the Python readability reviewers, so I’m familiar with this style of engineering; I had to manage some disputes between reviewers and applicants.)

                                                                      So you have to take all these rules with a grain of salt. These people can obviously get things done, and they all do things a little differently. They don’t always write as many tests as you’d ideally like. One of the things I tried to do as the readability reviewer was to push back against dogma and get people to relax a bit. There is value to global consistency, but there’s also value to local domain-specific knowledge. My pushing back was not really successful and Google engineering has gotten more dogmatic and sclerotic over the years. It was not fun to write code there by the time I left (over 5 years ago)


                                                                      So basically I think you have to look at what people build and see how they do it. I would rather read a bunch of stories like “Coders at Work” or “Masterminds of Programming” than read any empirical study.

                                                                      I think there should be a name for this empirical fallacy (or it probably already exists?) Another area where science has roundly failed is nutrition and preventative medicine. Maybe not for the same exact reasons, but the point is that controlled experiments are only one way of obtaining knowledge, and not the best one for many domains. They’re probably better at what Taleb calls “negative knowledge” – i.e. disproving something, which is possible and valuable. Trying to figure out how to act in the world (how to create software) is less possible. All things being equal, more testing is better, but all things aren’t always equal.

                                                                      Oil is probably the most rigorously tested project I’ve ever worked on, but this is because of the nature of the project, and it isn’t right for all projects as a rule. It’s probably not good if you’re trying to launch a video game platform like Stadia, etc.

                                                                      1. 8

                                                                        Anyway the point is that is that the empirical question isn’t really actionable IMO. You could “answer it” and it still wouldn’t tell you what to do.

                                                                        I think you got this post exactly right – there’s no amount of empiricism that can help you.

                                                                        This was my exact reaction when I read the original question motivating Hillel’s post.

I even want to take it a step further and say: Outside a specific context, the question doesn’t make sense. You won’t be able to measure it accurately, and even if you could, there would be such huge variance depending on other factors across teams where you measured it that your answer wouldn’t help you win any arguments.

                                                                        I think there should be a name for this empirical fallacy

                                                                        It seems especially to afflict the smart and educated. Having absorbed the lessons of science and the benefits of skepticism and self-doubt, you can ask of any claim “But is there a study proving it?”. It’s a powerful debate trick too. But it can often be a category error. The universe of useful knowledge is much larger than the subset that has been (or can be) tested with a random double blind study.

                                                                        1. 5

I even want to take it a step further and say: Outside a specific context, the question doesn’t make sense. You won’t be able to measure it accurately, and even if you could, there would be such huge variance depending on other factors across teams where you measured it that your answer wouldn’t help you win any arguments.

                                                                          It makes a lot of sense to me in my context, which is trying to convince skeptical managers that they should pay for my consulting services. But it’s intended to be used in conjunction with rhetoric, demos, case studies, testimonials, etc.

It seems especially to afflict the smart and educated. Having absorbed the lessons of science and the benefits of skepticism and self-doubt, you can ask of any claim “But is there a study proving it?”. It’s a powerful debate trick too. But it can often be a category error. The universe of useful knowledge is much larger than the subset that has been (or can be) tested with a random double blind study.

I’d say in principle it’s scientism; in practice it’s often an intentional sabotaging tactic.

                                                                          1. 1

                                                                            It makes a lot of sense to me in my context, which is trying to convince skeptical managers that they should pay for my consulting services. But it’s intended to be used in conjunction with rhetoric, demos, case studies, testimonials, etc.

                                                                            100%.

                                                                            I should have said: I don’t think it would help you win any arguments with someone knowledgeable. I completely agree that in the real world, where people are making decisions off rough heuristics and politics is everything, this kind of evidence could be persuasive.

                                                                            So a study showing that “catching bugs early saves money” functions here like a white lab coat on a doctor: it makes everyone feel safer. But what’s really happening is that they are just trusting that the doctor knows what he’s doing. Imo the other methods for establishing trust you mentioned – rhetoric, demos, case studies, testimonials, etc. – imprecise as they are, are probably more reliable signals.

                                                                            EDIT: Also, just to be clear, I think the right answer here, the majority of the time, is “well obviously it’s better to catch bugs early than later.”

                                                                            1. 2

                                                                              the majority of the time

                                                                              And in which cases is this false? Is it when the team has lots of senior engineers? Is it when the team controls both the software and the hardware? Is it when OTA updates are trivial? (Here is a knock-on effect: what if OTA updates make this assertion false, but then open up a huge can of security vulnerabilities, which overall negates any benefit that the OTA updates add?) What does a majority here mean? I mean, a majority of 55% means something very different from a majority of 99%.

This is the value of empirical software study. Adding precision to assertions (such as understanding that a 55% majority is a bit pathological but a 99% majority certainly isn’t). Diving into data and being able to understand and explore trends is another benefit. Humans are motivated to categorize their experiences around questions they wish to answer, but it’s much harder to answer questions that the human hasn’t posed yet. What if it turns out that catching bugs early or late is pretty much immaterial, and the real defect rate is simply a function of experience and seniority?

                                                                              1. 1

This is the value of empirical software study.

I think empirical software study is great, and has tons of benefits. I just don’t think you can answer all questions of interest with it. The bugs question we’re discussing is one of those.

                                                                                And in which cases is this false? Is it when the team has lots of senior engineers? Is it when the team controls both the software and the hardware? Is it when OTA updates are trivial? (Here is a knock-on effect: what if OTA updates make this assertion false, but then open up a huge can of security vulnerabilities, which overall negates any benefit that the OTA updates add?)

                                                                                I mean, this is my point. There are too many factors to consider. I could add 50 more points to your bullet list.

                                                                                What does a majority here mean?

                                                                                Something like: “I find it almost impossible to think of examples from my personal experience, but understand the limits of my experience, and can imagine situations where it’s not true.” I think if it is true, it would often indicate a dysfunctional code base where validating changes out of production (via tests or other means) was incredibly expensive.

What if it turns out that catching bugs early or late is pretty much immaterial, and the real defect rate is simply a function of experience and seniority?

                                                                                One of my points is that there is no “turns out”. If you prove it one place, it won’t translate to another. It’s hard even to imagine an experimental design whose results I would give much weight to. All I can offer is my opinion that this strikes me as highly unlikely for most businesses.

                                                                                1. 4

Why is software engineering such an outlier when we’ve been able to measure so many other things? We can measure vaccine efficacy and health outcomes (among disparate populations with different genetics, diets, culture, and life experiences), we can measure minerals in soil, we can analyze diets, heat transfer, we can even study government policy, elections, and even personality, though it’s messy. What makes software engineering so much more complex and context dependent than even a person’s personality?

                                                                                  The fallacy I see here is simply that software engineers see this massive complexity in software engineering because they are software experts and believe that other fields are simpler because software engineers are not experts in those fields. Every field has huge amounts of complexity, but what gives us confidence that software engineering is so much more complex than other fields?

                                                                                  1. 3

                                                                                    Why is software engineering such an outlier when we’ve been able to measure so many other things?

                                                                                    You can measure some things, just not all. Remember the point of discussion here is: Can you empirically investigate the claim “Finding bugs earlier saves overall time and money”? My position is basically: “This is an ill-defined question to ask at a general level.”

                                                                                    We can measure vaccine efficacy and health outcomes (among disparate populations with different genetics, diets, culture, and life experiences)

                                                                                    Yes.

                                                                                    we can measure minerals in soil, we can analyze diets, heat transfer,

                                                                                    Yes.

                                                                                    we can even study government policy

In some ways yes, in some ways no. This is a complex situation with tons of confounds, and also a place where policy outcomes in some places won’t translate to other places. This is probably a good analog for what makes the question at hand difficult.

                                                                                    and even personality

                                                                                    Again, in some ways yes, in some ways no. With the big 5, you’re using the power of statistical aggregation to cut through things we can’t answer. Of which there are many. The empirical literature on “code review being generally helpful” seems to have a similar force. You can take disparate measures of quality, disparate studies, and aggregate to arrive at relatively reliable conclusions. It helps that we have an obvious, common sense causal theory that makes it plausible.

                                                                                    What makes software engineering so much more complex and context dependent than even a person’s personality?

                                                                                    I don’t think it is.

                                                                                    Every field has huge amounts of complexity, but what gives us confidence that software engineering is so much more complex than other fields?

I don’t think it is, and this is not where my argument is coming from. There are many questions in other fields that are just as unsuited to empirical investigation as “Does finding bugs earlier save time and money?”

                                                                                    1. 2

In some ways yes, in some ways no. This is a complex situation with tons of confounds, and also a place where policy outcomes in some places won’t translate to other places. This is probably a good analog for what makes the question at hand difficult.

That hasn’t stopped anyone from performing the analysis and using these analyses to implement policy. That the analysis of this data is imperfect is beside the point; it still provides some amount of positive value. Software is in the data dark ages in comparison to government policy; what data-driven decision has been made among software engineering teams? I don’t think we even understand whether Waterfall or Agile reduces defect rates or time to ship compared to the other.

                                                                                      With the big 5, you’re using the power of statistical aggregation to cut through things we can’t answer. Of which there are many. The empirical literature on “code review being generally helpful” seems to have a similar force. You can take disparate measures of quality, disparate studies, and aggregate to arrive at relatively reliable conclusions. It helps that we have an obvious, common sense causal theory that makes it plausible.

                                                                                      What’s stopping us from doing this with software engineering? Is it the lack of a causal theory? There are techniques to try to glean causality from statistical models. Is this not in line with your definition of “empirically”?

                                                                                      1. 5

That hasn’t stopped anyone from performing the analysis and using these analyses to implement policy. That the analysis of this data is imperfect is beside the point; it still provides some amount of positive value.

It’s not clear to me at all that, as a whole, “empirically driven” policy has had positive value. You can point to successful cases and disasters alike. I think in practice the “science” here is at least as often used as a veneer to push through an agenda as it is to implement objectively more effective policy. Just as software methodologies are.

                                                                                        Is it the lack of a causal theory?

                                                                                        I was saying there is a causal theory for why code review is effective.

                                                                                        What’s stopping us from doing this with software engineering?

Again, some parts of it can be studied empirically, and should be. I’m happy to see advances there. But I don’t see the whole thing being tamed by science. The high-order bits in most situations are politics and other human stuff. You mentioned it being young… but here’s an analogy that might help with where I’m coming from. Teaching writing, especially creative writing. It’s equally ad-hoc and unscientific, despite being old. MFA programs use different methodologies and writers subscribe to different philosophies. There is some broad consensus about general things that mostly work and that most people do (workshops), but even within that there’s a lot of variation. And great books are written by people with wildly different approaches. There are some nice efforts to leverage empiricism like Steven Pinker’s book and even software like https://hemingwayapp.com/, but systematization can only go so far.

                                                                                    2. 2

                                                                                      We can measure vaccine efficacy and health outcomes (among disparate populations with different genetics, diets, culture, and life experiences)

                                                                                      Good vaccine studies are pretty expensive from what I know, but they have statistical power for that reason.

                                                                                      Health studies are all over the map. The “pile of college sophomores” problem very much applies there as well. There are tons of studies done on Caucasians that simply don’t apply in the same way to Asians or Africans, yet some doctors use that knowledge to treat patients.

                                                                                      Good doctors will use local knowledge and rules of thumb, and they don’t believe every published study they see. That would honestly be impossible, as lots of them are in direct contradiction to each other. (Contradiction is a problem that science shares with apprenticeship from experts; for example IIRC we don’t even know if a high fat diet causes heart disease, which was accepted wisdom for a long time.)

                                                                                      https://www.nytimes.com/2016/09/13/well/eat/how-the-sugar-industry-shifted-blame-to-fat.html

                                                                                      I would recommend reading some books by Nassim Taleb if you want to understand the limits of acquiring knowledge through measurement and statistics (Black Swan, Antifragile, etc.). Here is one comment I made about them recently: https://news.ycombinator.com/item?id=27213384

Key point: acting in the world, i.e. decision making under risk, is fundamentally different from scientific knowledge. Tinkering and experimentation are what drive real changes in the world, not planning by academics. He calls the latter “the Soviet-Harvard school”.

                                                                                      The books are not well organized, but he hammers home the difference between acting in the world and knowledge over and over in many different ways. If you have to have scientific knowledge before acting, you will be extremely limited in what you can do. You will probably lose all your money in the markets too :)


Update: after Googling the term I found in my notes, I’d say “Soviet-Harvard delusion” captures the crux of the argument here. One short definition is the (unscientific) overestimation of the reach of scientific knowledge.

                                                                                      https://www.grahammann.net/book-notes/antifragile-nassim-nicholas-taleb

                                                                                      https://medium.com/the-many/the-right-way-to-be-wrong-bc1199dbc667

                                                                                      https://taylorpearson.me/antifragile-book-notes/

                                                                                      1. 2

                                                                                        This sounds like empiricism. Not in the sense of “we can only know what we can measure” but in the sense of “I can only know what I can experience”. The Royal Society’s motto is “take nobody’s word for it”.

                                                                                        Tinkering and experimentation are what drive real changes in the world, not planning by academics.

                                                                                        I 100% agree but it’s not the whole picture. You need theory to compress and see further. It’s the back and forth between theory and experimentation that drives knowledge. Tinkering alone often ossifies into ritual. In programming, this has already happened.

                                                                                        1. 1

                                                                                          I agree about the back and forth, of course.

                                                                                          I wouldn’t agree programming has ossified into ritual. Certainly it has at Google, which has a rigid coding style, toolchain, and set of languages – and it’s probably worse at other large companies.

                                                                                          But I see lots of people on this site doing different things, e.g. running OpenBSD and weird hardware, weird programming languages, etc. There are also tons of smaller newer companies using different languages. Lots of enthusiasm around Rust, Zig, etc. and a notable amount of production use.

                                                                                          1. 1

                                                                                            My bad, I didn’t mean all programming has become ritual. I meant that we’ve seen instances of it.

                                                                                        2. 1

                                                                                          Good vaccine studies are pretty expensive from what I know, but they have statistical power for that reason.

                                                                                          Oh sure, I’m not saying this will be cheap. In fact the price of collecting good data is what I feel makes this research so difficult.

                                                                                          Health studies are all over the map. The “pile of college sophomores” problem very much applies there as well. There are tons of studies done on Caucasians that simply don’t apply in the same way to Asians or Africans, yet some doctors use that knowledge to treat patients.

We’ve developed techniques to deal with these issues, though of course, you can’t draw a conclusion with extremely low sample sizes. One technique frequently used in meta-studies to compensate for studies with low statistical power is called post-stratification.
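As a rough illustration, here is a minimal Python sketch of the reweighting idea behind post-stratification. The strata, shares, and numbers below are invented purely for the example (not taken from any study): per-stratum sample means are weighted by known population shares so an over-sampled stratum doesn’t dominate the overall estimate.

    # Minimal post-stratification sketch with invented data: reweight
    # per-stratum sample means by known population shares so an
    # unrepresentative sample doesn't skew the overall estimate.
    population_share = {"junior": 0.5, "mid": 0.3, "senior": 0.2}  # assumed population shares
    sample = {                                                     # invented observations per stratum
        "junior": [4, 5, 6],        # e.g. defects per engineer
        "mid":    [3, 3],
        "senior": [1, 2, 2, 1],     # seniors over-sampled here
    }

    def stratum_mean(values):
        return sum(values) / len(values)

    # Naive mean ignores that seniors are over-represented in the sample.
    all_values = [v for vs in sample.values() for v in vs]
    naive_mean = sum(all_values) / len(all_values)

    # Post-stratified mean: weight each stratum's mean by its population share.
    post_stratified = sum(population_share[s] * stratum_mean(vs) for s, vs in sample.items())

    print(f"naive: {naive_mean:.2f}  post-stratified: {post_stratified:.2f}")

With these made-up numbers the naive mean is pulled down by the over-sampled senior stratum, while the post-stratified estimate reflects the assumed population mix.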

                                                                                          Good doctors will use local knowledge and rules of thumb, and they don’t believe every published study they see. That would honestly be impossible, as lots of them are in direct contradiction to each other. (Contradiction is a problem that science shares with apprenticeship from experts; for example IIRC we don’t even know if a high fat diet causes heart disease, which was accepted wisdom for a long time.)

I think medicine is a good example of empiricism done right. Sure, we can look at modern failures of medicine and nutrition and use these learnings to do better, but medicine is significantly more empirical than software. I still maintain that if we can systematize our understanding of the human body and medicine, we can do the same for software, though, as in a soft science, definitive answers may stay elusive. Much work over decades went into the medical sciences to define what it even means to have an illness, to feel pain, to see recovery, or to combat an illness.

                                                                                          I would recommend reading some books by Nassim Taleb if you want to understand the limits of acquiring knowledge through measurement and statistics (Black Swan, Antifragile, etc.). Here is one comment I made about them recently: https://news.ycombinator.com/item?id=27213384

Key point: acting in the world, i.e. decision making under risk, is fundamentally different from scientific knowledge. Tinkering and experimentation are what drive real changes in the world, not planning by academics. He calls the latter “the Soviet-Harvard school”.

                                                                                          I’m very familiar with Taleb’s Antifragile thesis and the “Soviet-Harvard delusion”. As someone well versed in statistics, these are theses that are both pedestrian (Antifragile itself being a pop-science look into a field of study called Extreme Value Theory) and old (Maximum Likelihood approaches to decision theory are susceptible to extreme/tail events which is why in recent years Bayesian and Bayesian Causal analyses have become more popular. Pearson was aware of this weakness and explored other branches of statistics such as Fiducial Inference). (Also I don’t mean this as criticism toward you, though it’s hard to make this tone come across over text. I apologize if it felt offensive, I merely wish to draw your eyes to more recent developments.)

To draw the discussion to a close, I’ll try to summarize my position a bit. I don’t think software empiricism will answer all the questions, nor will we get to a point where we can rigorously determine that some function f exists that can model our preferences. However, I do think software empiricism together with standardization can offer us a way to confidently produce low-risk, low-defect software. I think modern statistical advances have offered us ways to understand more than the statistical approaches of the ’70s could, and that we can use many of the newer techniques from the social and medical sciences (e.g. Bayesian methods) to prove results. I don’t think that, even if we start a concerted approach today, our understanding will get there in a matter of a few years. To do that would mean undoing decades of software practitioners creating systemic analyses from their own experiences, and creating a culture shift away from the individual as artisan towards standardization, both of how we communicate results (what is a bug? how does it affect my code? how long did it take to find? how long did it take to resolve? etc.) and of how we describe team conditions (our team has n engineers, our engineers have x years of experience, etc.), which we just don’t have now. I have hope that eventually we will begin to both standardize and understand our industry better, but in the near term this will be difficult.

                                                                              2. 4

                                                                                Here’s a published paper that purposefully illustrates the point you’re trying to make: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC300808/. It’s an entertaining read.

                                                                                1. 1

                                                                                  Yup I remember that from debates on whether to wear masks or not! :) It’s a nice pithy illustration of the problem.

                                                                                2. 2

                                                                                  Actually I found a (condescending but funny/memorable) name for the fallacy – the “Soviet-Harvard delusion” :)

                                                                                  An (unscientific) overestimation of the reach of scientific knowledge.

                                                                                  I found it in my personal wiki, in 2012 notes on the book Antifragile.

                                                                                  Original comment: https://lobste.rs/s/v4unx3/i_ing_hate_science#c_nrdasq

                                                                                3. 3

                                                                                  I’m reading a book right now about 17th century science. The author has some stuff to say about Bacon and Empiricism but I’ll borrow an anecdote from the book. Boyle did an experiment where he grew a pumpkin and measured the dirt before and after. The weight of the dirt hadn’t changed much. The only other ingredient that had been added was water. It was obvious that the pumpkin must be made of only water.

                                                                                  This idea that measurement and observation drive knowledge is Bacon’s legacy. Even in Bacon’s own lifetime, it’s not how science unfolded.

                                                                                  1. 2

Fun fact: Bacon is often considered the modern founder of the idea that knowledge can be used to create human-directed progress. Before him, while scholars and astronomers often studied and invented things, most cultures still viewed life and nature as a generally haphazard process. As with most things in history, the reality involves more than just Bacon, and there most certainly were non-Westerners who had similar ideas, but Bacon still figures prominently in the picture.

                                                                                    1. 1

                                                                                      Hm interesting anecdote that I didn’t know about (I looked it up). Although I’d say that’s more an error of reasoning within science? I realized what I was getting at could be called the Soviet-Harvard delusion, which is overstating the reach of scientific knowledge (no insult intended, but it is a funny and memorable name): https://lobste.rs/s/v4unx3/i_ing_hate_science#c_nrdasq

                                                                                      1. 1

                                                                                        To be fair, the vast majority of the mass of the pumpkin is water. So the inference was correct to first order. The second-order correction of “and carbon from the air”, of course, requires being much more careful in the inference step.

                                                                                      2. 2

                                                                                        So basically I think you have to look at what people build and see how they do it. I would rather read a bunch of stories like “Coders at Work” or “Masterminds of Programming” than read any empirical study.

Perhaps, but this is already what happens, and I think it’s about time we in the profession raise our standards, both of pedagogy and of practice. Right now you can do a casual search on the Web and find respected talking heads arguing that their philosophy is correct, despite it being in direct contrast to another person’s philosophy. This behavior is reinforced by the culture wars of our times, of course, but there’s still much more aimless discourse than there is consistency in results. If we want to start taking steps to improve our practice, I think it’s important to understand what we’re doing right and, more importantly, what we’re doing wrong. I’m more interested here in negative results than positive results. I want to know where, as a discipline, software engineering is going wrong. There’s also a lot at stake here purely monetarily; corporations often embrace a technology methodology and pay for PR and marketing about their methodology to both bolster their reputations and to try to attract engineers.

I think there should be a name for this empirical fallacy (or it probably already exists?) Another area where science has roundly failed is nutrition and preventative medicine.

I don’t think we’re even at the point in our empirical understanding of software engineering where we can commit this fallacy. What do we even definitively understand about our field? I’d argue that psychology and sociology have stronger well-known results than what we have in software engineering even though those are very obviously soft sciences. I also think software engineers are motivated to think the problem is complex and impossible to study empirically for the same reason that anyone holds their work in high esteem; we believe our work is complicated and requires highly contextual expertise to understand. However, if psychology and sociology can make empirical progress in their fields, I think software engineers most definitely can.

                                                                                        1. 2

                                                                                          Do you have an example in mind of the direct contradiction? I don’t see much of a problem if different experts have different opinions. That just means they were building different things and different strategies apply.

                                                                                          Again I say it’s good to “look at what people build” and see if it applies to your situation; not blindly follow advice from authorities (e.g. some study “proved” this, or some guy from Google who may or may not have built things said this was good; therefore it must be good).

I don’t find a huge amount of divergence in the opinions of people who actually build stuff, vs. talking heads. If you look at what John Carmack says about software engineering, it’s generally pretty level-headed, and he explains it well. It’s not going to differ that much from what Jeff Dean says. If you look at their C++ code, there are even similarities, despite drastically different domains.

                                                                                          Again the fallacy is that there’s a single “correct” – it depends on the domain; a little diversity is a good thing.

                                                                                          1. 4

                                                                                            Do you have an example in mind of the direct contradiction? I don’t see much of a problem if different experts have different opinions. That just means they were building different things and different strategies apply.

Here are two fun ones I like to contrast: The Unreasonable Effectiveness of Dynamic Typing for Practical Programs (Vimeo) and The advantages of static typing, simply stated. Two separate authors who came to different conclusions from similar evidence. While, yes, their lived experience is undoubtedly different, these are folks who are espousing (mostly, not completely) contradictory viewpoints.

I don’t find a huge amount of divergence in the opinions of people who actually build stuff, vs. talking heads. If you look at what John Carmack says about software engineering, it’s generally pretty level-headed, and he explains it well. It’s not going to differ that much from what Jeff Dean says. If you look at their C++ code, there are even similarities, despite drastically different domains.

Who builds things though? Several people build things. While we hear about John Carmack and Jeff Dean, there are folks plugging away at the Linux kernel, on io_uring, on capability object systems, and all sorts of things that many of us will never be aware of. As an example, Sanjay Ghemawat is someone who I wasn’t familiar with until you talked about him. I’ve also interacted with folks in my career who I presume you’ve never interacted with and yet have been an invaluable source of learnings for my own code. Moreover these experience reports are biased by their reputations; I mean of course we’re more likely to listen to John Carmack than some Vijay Foo (not a real person, as far as I’m aware) because he’s known for his work at id, even if this Vijay Foo may end up having as many or more actionable insights than John Carmack. Overcoming reputation bias and lack of information about “builders” is another side effect I see of empirical research. Aggregating learnings across individuals can help surface lessons that otherwise would have been lost due to structural issues of acclaim and money.

                                                                                            Again the fallacy is that there’s a single “correct” – it depends on the domain; a little diversity is a good thing.

This seems to be a sentiment I’ve read elsewhere, so I want to emphasize: I don’t think there’s anything wrong with diversity, and I don’t think Empirical Software Engineering does anything to harm diversity. Creating complicated probabilistic models of spaces necessarily involves many factors. We can create a probability space which has all of the features we care about. Just condition against your “domain” (e.g. kernel work, distributed systems, etc.) and slot your result into that domain. I don’t doubt that a truly descriptive probability space will be very high dimensional here, but I’m confident we have the analytical and computational power to perform this work nonetheless.

                                                                                            The real challenge I suspect will be to gather the data. FOSS developers are time and money strapped as it is, and excluding some exceptional cases such as curl’s codebase statistics, they’re rarely going to have the time to take the detailed notes it would take to drive this research forward. Corporations which develop proprietary software have almost no incentive to release this data to the general public given how much it could expose about their internal organizational structure and coding practices, so rather than open themselves up to scrutiny they keep the data internal if they measure it at all. Combating this will be a tough problem.

                                                                                            1. 2

Yeah I don’t see any conflict there (and I’ve watched the first one before). I use both static and dynamic languages and there are advantages and disadvantages to each. I think any programmer should be comfortable using both styles.

                                                                                              I think that the notion that a study is going to change anyone’s mind is silly, like “I am very productive in statically typed languages. But a study said that they are not more productive; therefore I will switch to dynamically typed”. That is very silly.

It’s also not a question that’s ever actionable in reality. Nobody says “Should I use a static or dynamic language for this project?” More likely you are working on an existing codebase, OR you have a choice between say Python and Go. The difference between Python and Go would be a more interesting and accurate study, not static vs. dynamic. But you can’t do an “all pairs” comparison via scientific studies.

                                                                                              If there WERE a study definitely proving that say dynamic languages are “better” (whatever that means), and you chose Python over Go for that reason, that would be a huge mistake. It’s just not enough evidence; the languages are different for other reasons.

                                                                                              I think there is value to scientific studies on software engineering, but I think the field just moves very fast, and if you wait for science, you’ll be missing out on a lot of stuff. I try things based on what people who get things done do (e.g. OCaml), and incorporate it into my own work, and that seems like a good way of obtaining knowledge.

                                                                                              Likewise, I think “Is catching bugs earlier less expensive” is a pretty bad question. A better scientific question might be “is unit testing in Python more effective than integration testing Python with shell” or something like that. Even that’s sort of silly because the answer is “both”.

                                                                                              But my point is that these vague and general questions simply leave out a lot of subtlety of any particular situation, and can’t be answered in any useful way.

                                                                                              1. 2

                                                                                                I think that the notion that a study is going to change anyone’s mind is silly, like “I am very productive in statically typed languages. But a study said that they are not more productive; therefore I will switch to dynamically typed”. That is very silly.

While the example of static and dynamic typing is probably too broad to be meaningful, I don’t actually think this would be true. It’s a bit like saying “Well I believe that Python is the best language and even though research shows that Go has properties <x, y, and z> that are beneficial to my problem domain, well I’m going to ignore them and put a huge prior on my past experience.” It’s the state of the art right now; trust your gut and the guts of those you respect, not the other guts. If we can’t progress from here I would indeed be sad.

It’s also not a question that’s ever actionable in reality. Nobody says “Should I use a static or dynamic language for this project?” More likely you are working on an existing codebase, OR you have a choice between say Python and Go. The difference between Python and Go would be a more interesting and accurate study, not static vs. dynamic. But you can’t do an “all pairs” comparison via scientific studies.

                                                                                                Sure, as you say, static vs dynamic languages isn’t very actionable but Python vs Go would be. And if I’m starting a new codebase, a new project, or a new company, it might be meaningful to have research that shows that, say, Python has a higher defect rate but an overall lower mean time to resolution of these defects. Prior experience with Go may trump benefits that Python has (in this synthetic example) if project time horizons are short, but if time horizons are long Go (again in the synthetic example) might look better. I think this sort of comparative analysis in defect rates, mean time to resolution, defect severity, and other attributes can be very useful.
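To make that tradeoff concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it (defect rates, resolution times, ramp-up cost) is invented purely to show how the time horizon could flip which option looks cheaper; it is not a claim about either language.

    # Invented numbers only: illustrates how a one-time ramp-up cost can
    # outweigh, or be outweighed by, a per-month difference in bug cost.
    defects_per_month = {"Python": 10, "Go": 6}   # assumed defect rates
    hours_to_resolve  = {"Python": 2, "Go": 4}    # assumed mean time to resolution
    ramp_up_hours     = {"Python": 80, "Go": 0}   # assume the team already knows Go

    def bug_cost(lang, months):
        """One-time ramp-up plus accumulated hours spent resolving defects."""
        return ramp_up_hours[lang] + defects_per_month[lang] * hours_to_resolve[lang] * months

    for months in (3, 24):
        print(months, {lang: bug_cost(lang, months) for lang in ("Python", "Go")})
    # With these inputs: at 3 months Go is cheaper; at 24 months Python is.

The point is only that the same comparative data can support different decisions depending on the horizon, which is exactly the kind of conditioning an empirical study would need to report.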

                                                                                                Personally, I’m not satisfied by the state of the art of looking at builders. I think the industry really needs a more rigorous look at its assumptions and even if we never truly systematize and Fordify the field (which fwiw I don’t think is possible), I certainly think there’s a lot of progress for us to make yet and many pedestrian questions that we can answer that have no answers yet.

                                                                                                1. 2

                                                                                                  Sure, as you say, static vs dynamic languages isn’t very actionable but Python vs Go would be. And if I’m starting a new codebase, a new project, or a new company, it might be meaningful to have research that shows that, say, Python has a higher defect rate but an overall lower mean time to resolution of these defects.

                                                                                                  Python vs Go defect rates also seem to me to be far too general for an empirical study to produce actionable data.

                                                                                                  How do you quantify a “defect rate” in a way that’s relevant to my problem, for example? There are a ton of confounds: genre of software, timescale of development, size of team, composition of team, goals of the project, etc. How do I know that some empirical study comparing defect rates of Python vs. Go in, I dunno, the giant Google monorepo, is applicable to my context? Let’s say I’m trying to pick a language to write some AI research software, which will have a 2-person team, no monorepo or formalized code-review processes, a target 1-year timeframe to completion, and a primary metric of producing figures for a paper. Why would I expect the Google study to produce valid data for my decision-making?

                                                                                                2. 1

                                                                                                  Nobody says “Should I use a static or dynamic language for this project?”

                                                                                                  Somebody does. Somebody writes the first code on a new project and chose the language. Somebody sets the corporate policy on permissible languages. Would be amazing if even a tiny input to these choices were real instead of just perceived popularity and personal familiarity.

                                                                                          2. 2

                                                                                            I meant to comment on that original thread, because I thought the question was misguided. Well now that I look it’s actually been deleted?

                                                                                            Too many downvotes this month. ¯\_(ツ)_/¯

                                                                                            1. 1

                                                                                              This situation is not ideal :(

                                                                                          1. 6

                                                                                            I don’t need incentives like “carbon neutrality” or “global warming” to know that we must reduce our resource consumption, both energy and materials.

We need a culture of repair rather than buying something new, and people make it too easy for themselves by blaming the big corporations. Indeed, there should be legislation requiring producers and vendors to publish schematics and sell spare parts to allow repair, but it must also be within our spirit to nurture a culture of teaching repair and to keep it in mind when faced with a defect.

                                                                                            Regarding computing, a big step would be to de-bloat the web (which would also make it faster). Lots of energy is wasted with that, and it would have a great overall benefit. I don’t think you can push it with legislation, but maybe as a new counterculture?

Streaming is another thing that consumes a lot of energy (because it’s not multicast and each stream connection must be served individually). A very ambitious solution here is peer-to-peer networks, where the distance between nodes gets shorter, improving efficiency (and latency). If you consider personal devices and the fact that DRAM consumes energy whether filled or not, I can imagine any handheld or desktop becoming a data node without much added energy consumption in each device.

                                                                                            1. 7

                                                                                              Regarding computing, a big step would be to de-bloat the web (which would also make it faster). Lots of energy is wasted with that, and it would have a great overall benefit. I don’t think you can push it with legislation, but maybe as a new counterculture?

                                                                                              This will do almost nothing. The entire internet is just 0.6% of the world’s emissions, and the vast vast majority of the internet is streaming content. Netflix by itself made up 15% of the total internet bandwidth in 2018. Website bloat doesn’t matter.

                                                                                              (You mention streaming after this, but it’s not the individual connections that matter, it’s the transfer of data that’s energy intensive.)

                                                                                              1. 2

                                                                                                Phones and tablets (which make up the majority of computing devices) really aren’t very repairable, beyond trivial things like cracked screens. Have you ever seen the insides of one?

                                                                                                DRAM consumes energy, but that’s an absolutely tiny quantity. Much more is used by the CPU, and it’s the display backlight that’s the great majority. Phones and tablets are very energy efficient compared to desktop computers with their GPUs glowing cherry-red in their helium-cooled enclosures.

                                                                                                1. 4

Current phones are already able to be repaired; the current political debate is in response to manufacturers artificially constraining the supply of spare parts by tweaking them just barely enough to break compatibility, then demanding exclusivity on the custom part from the component maker. That way, they can push people to pay for a $300 new phone instead of a $10 new chip plus $50 in labour costs.

That said, making phones easier to repair, and especially easier to replace the battery of, is something that isn’t done because it’s simply not a priority, not because it’s necessarily hard.

                                                                                              1. 2

The directory browser that ships with vim is not particularly intuitive and ships with a wealth of features I will most likely never use. I get the sense that many developers just blindly install a shiny plugin without understanding what netrw can do. Sure, netrw is not perfect, but fewer dependencies in my life and striving for simplicity is a good thing.

It’s okay to use plugins if it’ll make your life better, even if you can manage to do the same things in an unintuitive way without it.

                                                                                                1. 3

                                                                                                  Clearly, yes, but there’s also value in keeping your configuration small and easy-to-maintain. Also, in being able to use computers you haven’t extensively customized.

                                                                                                  I know that you lean heavily into “programming” your environment with AutoHotKey etc., and I know there are advantages to doing it that way. Me, I started out doing sysadmin stuff, so being able to use a “stock” computer was important to me - and I’m trying to keep maintenance work down so that I can do more “fun” stuff.

                                                                                                  1. 2

                                                                                                    Oh, totally agreed! There is significant value in keeping your setup minimalist, and I know I’ve given up a lot of good stuff by customizing so heavily. Whenever I have to SSH into an AWS instance it’s miserable. And I can only afford to customize so much because I’m independent and use only one computer.

I mostly push back because a lot of online vim communities turn it into almost a moral thing: ultraminimalism is The Right Way and heavily customizing is Wrong. You see this sentiment a lot on r/vim, esp with -romainl-, and to my understanding it’s also infected the vim IRC. The world is big enough for both of us.

                                                                                                    1. 1

                                                                                                      ultraminimalism is The Right Way and heavily customizing is Wrong

                                                                                                      Not specific to vim, but there is a big advantage in discouraging customisation among your core userbase: it encourages you to get the defaults right. This is a problem that a lot of open source programs suffer from: you end up with a load of customisation potential and all of the long-term users have a setup that is very different from new users. New users struggle to get the thing to work because they haven’t accumulated the massive set of configuration options that everyone who uses the program seriously has built up over the years.

I keep reading about how amazing neovim is, but I spent half an hour trying to follow some instructions from someone who had an (apparently) amazing config and gave up. It doesn’t matter to me that it can do all of these amazing things if I can’t figure out how to make it do any of them.

                                                                                                      1. 1

Isn’t that more of a second-order knowledge-gathering problem (project level) and/or an experience-transfer problem (social level)? The developer has provided a set of features believed to be useful or at least interesting, yet does not follow up to learn how they are actually used or baked into a workflow, as a step towards providing presets that can be added to or subtracted from, and taught. The users react to this by creating ad-hoc exchanges with low visibility, poor discovery, upstream desynchronisation, and a lot of other problems.

                                                                                                1. 0

What’s the point of this? Most of the unique features of z/OS are only really useful or interesting if you’re running it on a mainframe, which this person isn’t doing. 90% of the blogpost is the person trying to get a copy of it in the first place, and talking about code licensing bullshit.

                                                                                                  I don’t see why anyone would go through this trouble except out of curiosity, but as far as I can tell for ‘normal’ use it’s basically just a unix box with some quirks, which along with the earlier licensing BS makes it seem like a lot of effort for very little gain – compare with running something like 9front where it’s a mostly unique system and you can acquire the entire thing for free without much effort.

                                                                                                  Can someone explain why this is useful / interesting to do?

                                                                                                  1. 2

                                                                                                    What’s the point of this?

                                                                                                    It makes the OP happy. What other justification does he need?

                                                                                                    1. 1

Ok, that’s cool. But this is a guide on installing it, and he doesn’t really give me a reason to do any of that. He said part of his reason for installing it is to pass the knowledge on to the next generation, but he just utterly fails to give any kind of reason why this is worthwhile knowledge to pass on if you’re not literally working as a sysadmin on Wall Street.

                                                                                                    2. 1

They say why; it’s the first sentence:

                                                                                                      Some people retire and buy an open top sports car or big motorbike. Up here in Orkney the weather can change every day, so instead of buying a fast car with an open top, when I retired, I got z/OS running on my laptop for the similar sort of price! This means I can continue “playing” with z/OS and MQ, and helping the next generation to use z/OS. At the end of this process I had the software installed on my laptop, many unwanted DVDs, and a lot of unnecessary cardboard

                                                                                                      1. 1

Calling z/OS a “unix box with quirks” is underselling it extremely. It’s quite a bizarre OS branch that people know little about, but that’s because IBM has no hobbyist program and you only see it if you’re basically in MIS at a Fortune 500.

                                                                                                        I don’t think there’s too much other than licensing bullshit in the OP either (it’s thin otherwise); he’d be better off using literally anything z/PDT for hyucks.

                                                                                                        1. 1

                                                                                                          Calling z/OS a “unix box with quirks” is underselling it extremely.

                                                                                                          And yet neither this blogpost, nor the wikipage for the operating system does anything to disabuse me of this notion, and there doesn’t seem to be any feature of this that is useful for someone running it on something that isn’t a mainframe.

                                                                                                          I don’t think there’s too much other than licensing bullshit in the OP either (it’s thin otherwise); he’d be better off using literally anything z/PDT for hyucks.

                                                                                                          yup

                                                                                                      1. 16

This is more of a console-based program than a CLI, but VisiData is an essential part of my workflow when dealing with structured data like CSVs. So much more convenient than using a spreadsheet like Excel, and very efficient.
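(For anyone who wants to try it: opening a file is typically just running vd data.csv from a terminal, and things like per-column frequency tables are a single keystroke away; the exact keybindings are worth checking against the current VisiData documentation rather than taking my word for it.)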

                                                                                                        1. 4

                                                                                                          I’ve never seen VisiData before - it’s amazing! Thank you!

                                                                                                          1. 3

                                                                                                            Paul’s doing a Strange Loop workshop and I’m so sad I have to miss it due to conflicting obligations.

                                                                                                          1. 2

                                                                                                            Thank you for posting this, I’ve been having so much fun for the past week playing with it

                                                                                                            1. 2

I’ve been writing a program to run batches on AWS. I hope to make useful adjectives (like “unreal engine”) easier to find, by letting you try ten candidates on ten different prompts and see if you get any patterns.