1. 4

    That’s an interesting point of view. Can “winner’s game” and “loser’s game” even be applied to a non-competitive field such as programming? I guess that, in this case, it depends mostly on how we measure the points.

    1. 1

      I was thinking the same thing. For this analogy to work, you need to assign points to a winning or losing category. And this is pretty arbitrary (e.g. using N+M string concatenation might be a lost point, or completely neutral, depending on the application).

      There are also plenty of competitive games (like soccer) where most points can’t be objectively assigned to the ‘winning’ or the ‘losing’ party.

      The article certainly raises more questions than it answers. But it seems most people on lobste.rs can relate to the author’s point of view.

    1. 9

      Oh, is this the first “mainstream”/high-profile company using Nix and doing a public “coming out” about it? Or did I miss some earlier ones?

      1. 7

        There was the retailer Target, but I wouldn’t say they had a public coming out - https://github.com/target/lorri.

        1. 3

          This was my thought as well! Now Nix will become the new Docker! Yay =)

          1. 18

            I say this as somebody who loves Nix, but many people will struggle with Nix’s learning curve and hate it. For all Docker’s failings, do not underestimate how much it is loved because you can just stash a bunch of Unix commands in your Dockerfile and that’s mostly it.

            My worry is Nix getting too much exposure before all the rough edges are smoothed out. Though of course, exposure also brings in new contributors, which may help with that.

            1. 7

              So far our tooling takes care of (almost) everything and the magic happens under the hood. The developers haven’t had to interact directly with Nix yet. We’ll see how things change once developers need to write their own configurations/derivations in Nix. The developer behind all this effort has also released a series of videos introducing Nix in case you’re interested! https://www.youtube.com/watch?v=NYyImy-lqaA&list=PLRGI9KQ3_HP_OFRG6R-p4iFgMSK1t5BHs

              1. 2

                The documentation situation is a little frustrating. There’s actually too much documentation, and it’s easy to accidentally land on obsolete or out-of-date information. For example, the Nix installer is totally broken on macOS Catalina because root directories are locked down and it can’t manage /nix the way it wants. There’s a workaround, but it’s only documented in the ~2000th comment on a GitHub issue…

                Yes, I intend to make a contribution to correct this, but:

                1. I have to figure out how to do that for this project, which I’m new to
                2. I have to confirm the solution I found is the temporary official one
                3. This has been broken for a year and a fix is coming “soon,” so is it even worth documenting the current process?
                1. 2
                  1. 1

                    This also works! https://dev.to/louy2/installing-nix-on-macos-catalina-2acb

                    curl -L https://raw.githubusercontent.com/NixOS/nix/d42ae78de9f92d60d0cd3db97216274946e8ba8e/scripts/create-darwin-volume.sh | sh
                    curl -L https://nixos.org/nix/install | sh
                    

                    Relying on a script that is no longer on master is definitely not a good idea, but… /shrug

                2. 1

                  I strongly dislike docker containers and the whole ecosystem around dockerhub.

                  However, the syntax of Dockerfiles is extremely beautiful. Probably my favorite file format ever. It is 100% understandable by anybody who has never heard of Docker. Even if you ignored Docker entirely, you could still run Dockerfiles “by hand” to reproduce the systems they describe. Sadly, the same thing cannot be said at all for Nix files, in which quite a lot of implicit things happen and which are punctuation-ridden for unclear reasons.

                  EDIT: I would love it if a “lite”, less powerful Nix file format existed, where the graph is linearized and it has a Dockerfile-like feel (maybe using the exact same syntax).

                  1. 9

                    As I understand it, the Nix expression language is not an essential part of the Nix ecosystem. As long as you generate valid .drv files, you can theoretically use any language and still manage your system with Nix. I believe (though I’d love it if someone could confirm) that Guix, while using Guile Scheme for expressions, still outputs the same .drv format[1].

                    Another example I know of is dhall-nix, which allows you to translate Dhall expressions into Nix expressions[2].

                    So in theory, as long as your config language of choice can be mapped to the Nix expression language, or a subset of it which you wish to use, you may be able to write a translator which converts it to a Nix expression. Alternatively, you can write a compiler which outputs .drv files directly.
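
                    A rough sketch of what that can look like in practice (assuming the dhall-to-nix binary from the dhall-nix package mentioned above is on your PATH; hello is just an arbitrary nixpkgs attribute):

                    # translate a Dhall expression into a Nix expression
                    echo '{ name = "example", enable = True }' | dhall-to-nix

                    # instantiate an ordinary Nix expression into a .drv file and inspect it
                    drv=$(nix-instantiate '<nixpkgs>' -A hello)
                    nix show-derivation "$drv"    # newer Nix spells this: nix derivation show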

                    [1] https://guix.gnu.org/manual/en/html_node/Derivations.html

                    [2] http://www.haskellforall.com/2017/01/typed-nix-programming-using-dhall.html

                    1. 1

                      It really depends on what you mean by essential. Without Nix the programming language, Nix is much closer to being yet another hermetic build system.

                      I try to make that point in more detail here:

                      https://blog.eigenvalue.net/nix-superglue-for-immutable-infrastructure/

                    2. 1

                      I don’t know a lot about Nix, but having a graph instead of a linear set of instructions makes caching much more powerful. I’ve sat through many painfully long Docker builds because the thing I had to change happened to be located early in the Dockerfile. If every single thing in your image were immutable and created in a fixed linear order, I would imagine that build times would be completely unmanageable.

              1. 2

                I’m going to give Julia a try. It looks pretty interesting. It has a rich type system (with optional type annotations), you can still use a REPL, it’s supposed to be fast, and the syntax looks familiar enough to be easy to learn.

                1. 3

                  What’s special about Julia that you can’t achieve in other high-level programming languages with types/objects?

                  1. 0

                    Not sure. But that’s like asking what’s special about any language. Aren’t they all Turing complete? Why not use Brainfuck for everything? It’s Turing complete.

                    1. 2

                      Well, you’re correct but also missing the point. The title makes me think that there’s some feature in Julia that would have caught this error ahead of time, which isn’t the case.

                      1. 2

                        This is a really interesting line of reasoning/argument. In particular, I think Julia is pretty special because it was designed specifically for scientific/numerical computing. This has influenced a variety of language decisions and certainly the ecosystem.

                    1. 11
                      1. 4
                        1. 3

                          I had soooo much trouble installing that package.

                          My brain was convinced it was “ascii enema”.

                          1. 2

                            Also tee(1)

                            1. 1

                              Caution: don’t play with this while doing or listing anything that is private….

                              By default it squirts your terminal session into the cloud… (which, in theory, is public until you sign up and explicitly make it private)

                            2. 2

                              Use with the -t timings-file argument, then replay with scriptreplay.

                              This can be used to drive, oh, say, as a random example, phosphor(6x), to replay console sessions as a screensaver.

                              This can actually be useful for system set-up/configuration. If you’re accessing a new box over a serial interface (serial-over-IP, IPMI, ILOM, etc.), you can record the entire sequence of a new system install, including boot, BIOS, bootloader, and kernel boot. You can also choose the playback speed to accelerate the replayed session.
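
                              A minimal sketch of that workflow (file names are arbitrary; exact flags depend on your util-linux version):

                              # record the session, sending timing data to a separate file
                              script -t 2> install.tm install.log
                              # ... do the install over the serial console, then exit the shell ...

                              # replay it later; the trailing divisor speeds playback up (here 2x)
                              scriptreplay install.tm install.log 2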

                            1. 15

                              A lot happens in /proc of course

                              Valuable info includes:

                              • /proc/$pid/exe -> always gives you the executable, even if the process is inside a Docker container
                              • xargs -0 < /proc/$pid/environ
                              • xargs -0 < /proc/$pid/cmdline
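
                              A quick sketch of reading those ($pid here is just a hypothetical process id):

                              pid=1234                                  # any process you own
                              readlink /proc/$pid/exe                   # path to the running executable
                              tr '\0' '\n' < /proc/$pid/environ         # environment, one variable per line
                              tr '\0' ' ' < /proc/$pid/cmdline; echo    # command line with its arguments
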
                              1. 3

                                procfs is one of my favourite things in Linux! I miss it so much when I have to use macOS.

                                1. 6

                                  My favouritest thing is strace….

                                  …which tells me a lot of things under the hood go frootling about in /proc….

                                  1. 3

                                    For example…. Consider the very very useful lsof (list open files)

                                    strace -e openat lsof

                                    or even
                                    

                                    strace -e openat ps

                                2. 2

                                  Also /proc/$pid/wchan to quickly see which syscall your code is blocked in

                                1. 4

                                  Familiarize yourself with composition: what Linux looks like in a disk image, a chroot, a container, etc. Things like Packer, mkosi, casync, welder, debootstrap, etc. This will lead into package management and servicing (updates, etc.)
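
                                  For a first taste of that, a minimal sketch using debootstrap (Debian-specific; the target path and suite are arbitrary examples):

                                  # build a minimal Debian userland into a directory...
                                  sudo debootstrap stable /srv/chroot/stable http://deb.debian.org/debian
                                  # ...and step into it: same kernel, different userland
                                  sudo chroot /srv/chroot/stable /bin/bash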

                                  Then systemd. From Mattias [1], Enrico [2] or Lennart’s blog. You might want to follow the systemd releases page. You can use Debian or Fedora, but even if you use Debian I suggest you track the Fedora changelog.

                                  A good, organized collection of readings, focused on internals, is [3].

                                  People are already giving you great eBPF resources. New titles are coming out. I would suggest you experiment with Sysdig’s Falco.

                                  I’ve also learned a thing or three from Julia Evans’ zines and blog posts [4] and I bought her prints. And in terms of actual books, consider Kerrisk’s “The Linux Programming Interface” and the latest Nemeth et al. “UNIX and Linux System Administration Handbook”

                                  I hope this helps. I’ve been using Linux for over 15 years, creating distros, packaging, operating large fleets of it, using it as a desktop and more. I’m surprised how much of it evolves and reinvents itself. It keeps me an eternal learner and a perennial novice!

                                  [1] https://ma.ttias.be/learning-systemd/
                                  [2] https://www.enricozini.org/blog/2017/debian/systemd-01-intro/
                                  [3] https://0xax.gitbooks.io/linux-insides/
                                  [4] https://jvns.ca/

                                  1. 1

                                    wow, thank you so much for sharing this!

                                  1. 11

                                    Writing a simple shell is the hard way, but it touches a lot of Unix concepts (syscalls, libc, forking/zombies, signals, pipes, file descriptors, ttys/sessions, etc.) in a broad but hands-on way.

                                    1. 2

                                      That’s actually a good idea. I did a really simple shell many years ago. I’m pretty sure I can do something more complex now.

                                      1. 1

                                        This is a very good idea. We did this for our operating systems class at uni, and I thought it was a lot of fun. We also wrote a very simple file system implementation which had a C API much like POSIX (read, write, open, close, seek, iirc).

                                        Nowadays you could do the same in userland, then maybe write a FUSE implementation around it, and perhaps convert it into a kernel module later.

                                      1. 5

                                        I’m assuming you’re asking about using Linux as an operating system and not understanding the intricacies of the kernel.

                                        This may sound like an odd recommendation, but the simple act of compiling the tools you use from source is surprisingly effective at exposing you to large swathes of how linux systems work. I’ll leave the explanation of why that’s true as an exercise for the reader :)

                                        1. 1

                                          It’s more about the intricacies of the kernel (and some user-space libraries to debug and trace the system)

                                        1. 1

                                          For clarification: Linux the kernel or Linux as an operating system (Linux kernel + userland)?

                                          1. 1

                                            Kernel + userland. I want to learn more about the kernel, but also about troubleshooting Linux problems, and how to use tools like ftrace, strace, eBPF, perf, etc.
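
                                            For example, the kind of one-liners I’d like to understand and build on (a rough sketch; bpftrace installed separately, $pid being any process of interest):

                                            sudo strace -f -p $pid    # trace the syscalls of a running process
                                            sudo perf top             # live view of where CPU time is going
                                            sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s %s\n", comm, str(args->filename)); }'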

                                          1. 9

                                            I’ve done this with teams before. Always regretted it. Flaky Tests should probably be called Sloppy Tests. They always point to a problem that no one wants to take the time to deal with.

                                            1. 2

                                              Flakiness isn’t always test flakiness. We also have infra flakiness (or intermittent infra failures) that is hard or impossible to solve. With respect to test retries, I somewhat agree with you, but as always, this is a matter of tradeoffs: do you prefer faster product development, or an extremely reliable product with developers spending time trying to fix issues that are false positives most of the time?

                                              1. 1

                                                I haven’t tried this retry approach, but my gut reaction is to agree with you. Reading the article my first reaction was “why not just fix the flaky tests”?

                                                If the tests fail sporadically and often, how can you assume it’s the tests at fault and not the application code? And if it’s the latter, it’s affecting customers.

                                                1. 1

                                                  When new software is running on pre-production hardware, the line of delineation is not so easy to draw. Flaky tests could be one or the other, and filtering them out (based on the cause being one or the other) is not exactly straightforward.

                                                2. 1

                                                  It sounds bananas for GUI development, but it can make sense for statistical software, where some failures are to be expected. Maybe failures are unavoidable for some GUI environments? I can’t think why off the top of my head, though.

                                                  1. 1

                                                    The biggest difficulty is that flaky tests in end-to-end environments involve basically every part of the system, so any sort of non-determinism or race condition (timing is almost always at the core of these) can be involved. Thank god Javascript is single-threaded.

                                                    I once had a test fail intermittently for weeks before I realised that a really subtle CSS rule causing a 0.1s color fade would cause differences in results if the test was executing ‘too fast’

                                                  1. 2

                                                    Great post, I’m always interested in how companies deal with flakiness.

                                                    At Mozilla we attempted automatic retries, but we have so much flakiness that it was significantly impacting CI resources and we turned it back off. Instead, we let the next push act as the retry and have dedicated staff to monitor the CI and look for failure patterns (they can still manually retry if needed). There is also tooling to mute known intermittents from the UI, so developers have a better sense of whether or not they broke something.

                                                    Having people manually perform a task that could be automated is not a sexy solution, but it works fairly well and is probably the right trade-off for us in a cost benefit analysis.

                                                    1. 2

                                                      “significantly impacting CI resources”

                                                      We’ve seen that too, specifically for iOS, where we have more limited resources :(, but automatic retries are faster than developers looking for failures.

                                                      “have dedicated staff to monitor the CI and look for failure patterns”

                                                      I interned at Mozilla almost two years ago and I remember that there was a project to solve this. Sad to hear that it hasn’t been fully solved yet.

                                                      1. 1

                                                        You’re probably thinking of the autoclassify feature. That is being used and has reduced the amount of manual work sheriffs need to do. I don’t think it was ever intended to outright replace the sheriffs though.

                                                        Tbh, I’m glad Mozilla isn’t throwing crazy resources at the intermittent problem. We have a system that works pretty effectively and for a fraction of the cost it would take to automate. That’s not to say we’ll stop making incremental improvements. Maybe one day it will be fully automated, just not through massive spending and heroic effort.

                                                      2. 2

                                                        At FB, we retried 3 times; if it didn’t work, we emailed the commit author. If a test was failing (with retries) on lots of diffs, we would email the contact address for the feature under test and take the test out of rotation. Very infrequently, we would re-run the failed tests again; if one of them passed 50 times, we’d put it back in the rotation (or if someone pushed a fix and manually re-enabled the test).

                                                        “significantly impacting CI resources”

                                                        Yes. We did notice that :D

                                                        1. 1

                                                          “If a test was failing (with retries) on lots of diffs we would email the contact address for the feature under test and take the test out of rotation.”

                                                          We do this as well. Every intermittent has an orangefactor score, which is a proxy for how much pain it causes everyone (basically just failure rate and frequency). Once an intermittent passes a certain threshold, the relevant test is disabled after a short grace period to get it fixed.

                                                      1. 2

                                                        I feel like monorepos have been a hot topic in the last few weeks. At least they are in my bubble (called work), and it seems they are here and on the orange site as well.

                                                        “Last year, we moved our Android apps and libraries to a monorepo and increased the size of our Android team.”

                                                        Shopify is certainly not just an Android shop, so this move does not mean that everything moved into one repo. I also saw that fuzziness at work, where “Monorepo” was the project name for merging the repositories of two projects, leaving lots of other repos on their own.

                                                        The term “Monorepo” has lost its literal meaning: Mono = Single as in “a single repo for the whole company”. Even Google does not have a single repo for the whole company because Android, Chrome, Go are not inside the big repository.

                                                        The term is probably not about size. I would assume we can find a small company which uses a single repo just like Google and Facebook, but the repo is smaller than a project repo somewhere else.

                                                        Any idea for a good definition?

                                                        One approach could be: in an ordinary repo you will find some build-system stuff at the root. In a monorepo you will only find folders at the root, with the build-system stuff inside them. There is no “build everything” in a monorepo.

                                                        1. 2

                                                          I always took monorepo to mean per product, not per company.

                                                          1. 2

                                                            The definition of monorepo is definitely not clear. We don’t understand it as a single repository for the entire company, but as a repository containing different projects that share code. Using this definition, we have two mobile monorepos: one for Android, and another one for iOS.

                                                          1. 4

                                                            We have a lot of open positions at Shopify mostly in Canada (Toronto, Ottawa, Montreal).

                                                            We use Ruby/Rails but also Go, Python… Great work environment and good perks!

                                                            1. 2

                                                              Nice to see remote is a possibility (I am in Mission BC & don’t want to move to KW ;). A former colleague left where I’m at for Shopify.

                                                            1. 1

                                                              At the same time, do you think it would make sense for a company to develop this commercially?

                                                              Technology giants are so valuable because of their ability to find correlations in users’ data and create insights about users, which help them generate business based on similarities between users. (Again, it would be much better for users to get a locally optimized model, which is something Google has been adding to our phones’ keyboards.)

                                                              1. 1

                                                                IMHO, this is not only about privacy but also about yielding the computation to a distributed network of devices. As you say, Google has tested it on Google Keyboard (https://ai.googleblog.com/2017/04/federated-learning-collaborative.html). And as far as I know, some other companies such as Mozilla are investigating this technique.

                                                              1. 1

                                                                I think most of the issues the author has could be solved by a sparse git checkout. Basically you only check out one subdirectory of the whole git repo. It’s faster, plus it allows for smaller checkouts.

                                                                There is definitely truth in the objections though; a monorepo has disadvantages too. What I did was create the monorepo and do a split to commit back to the individual repos. It allowed us to see if it was a match for us without changing our tooling, as the individual repos were being kept up to date too. It helped us PoC it fast without wasting too much time.
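
                                                                For reference, a minimal sketch of the classic sparse-checkout setup (the repo URL and path are made up; newer git also has a dedicated git sparse-checkout command):

                                                                git clone --no-checkout https://example.com/big-monorepo.git
                                                                cd big-monorepo
                                                                git config core.sparseCheckout true
                                                                echo "services/payments/" >> .git/info/sparse-checkout
                                                                git checkout master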

                                                                1. 1

                                                                  That would definitely help improve git performance if the repository’s history is too big, but there are other problems that are hard to solve, such as running only the tests needed to check the correctness of the changed parts. Big companies using a monorepo model, such as Facebook and Google, have invested a lot of effort in designing tools (and even their own build systems, see https://bazel.build/ and https://buckbuild.com/) for an efficient monorepo.
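
                                                                  With Bazel, for example, the dependency graph lets you ask for exactly the tests affected by a change (a sketch; the target labels are hypothetical):

                                                                  # all test targets that transitively depend on the changed library
                                                                  bazel query "kind('.*_test', rdeps(//..., //libs/payments:payments))"
                                                                  # then build and run only those
                                                                  bazel test $(bazel query "kind('.*_test', rdeps(//..., //libs/payments:payments))")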

                                                                1. 4

                                                                  Hey all, some context on this. Shield studies are studies run by Mozilla to try new features in a random population (https://wiki.mozilla.org/Firefox/Shield/Shield_Studies). There you can get some context on the why and how, and it’s possible to see which ones are being run and which ones are queued to run in the future (https://wiki.mozilla.org/Firefox/Shield/Shield_Studies/Queue). It’s also important to point out that, even if you have a study installed, it doesn’t mean you’re sending data. Those studies usually only send data from 1%-2% of the population.

                                                                  Moreover, running these kinds of studies is always optional. They can be disabled in about:preferences#privacy by unchecking the “Allow Firefox to install and run studies” checkbox. And it’s also possible to see more information about the studies in about:studies#shieldStudies.

                                                                  If you really, really, want to see what data is sent by Firefox (Telemetry data, health data, Shield studies data…), it’s possible to go to about:telemetry and filter by type, see archived pings, and the raw JSON that is sent.

                                                                  1. 10

                                                                    There are several main problems with this.

                                                                    1. Optional does not mean opt-out, it means opt-in. If you want to collect data from loyal Mozilla fans, then by all means give them the ability to turn it ON.

                                                                    2. If #1 is unavoidable, don’t be unprofessional and don’t do mysterious things with the power that you grabbed via the opt-out default. “MY REALITY IS DIFFERENT THAN YOURS” is one of the worst things you can put in a description that loyal, unsuspecting users will see.

                                                                    3. If #2 happens by accident, write an apology and clean up your act. The replies from Mozilla thus far have been “it’s shield studies”. This is so cold and tone-deaf. Tell us what you’re going to do to make it better and make sure we can still trust Firefox!

                                                                    1. 2

                                                                      I agree with you, but tbh, I don’t understand the problem with sharing anonymous data that can help improve a product you use every day. If users had to give explicit consent to share even the most basic data, Mozilla would never be able to understand how people use Firefox. Don’t misunderstand me, I really care about my privacy and I don’t want my data to be sold or used to show me ads, as other big companies do, but Mozilla’s policy on data privacy (https://www.mozilla.org/en-US/privacy/principles/) is very strict about that.

                                                                    2. 1

                                                                      Also, about:studies shows which studies are ongoing, and a link to prefs to change this.