1. 4
      1. 3

        I had soooo much trouble installing that package.

        My brain was convinced it was “ascii enema”.

        1. 2

          Also tee(1)

          1. 1

            Caution: Don’t play with this while doing or listing anything that is private….

            By default it squirts your terminal session into the cloud… (which, in practice, is public until you sign up and explicitly make it private)

          2. 2

            Use with the -t timings-file argument, then replay with scriptreplay.

            This can be used to drive, oh, say, as a random example, phosphor(6x), to replay console sessions as a screensaver.

            This can actually be useful for system set-up / configuration. If you’re accessing a new box over a serial interface (serial-over-IP, IPMI, ILOM, etc.), you can record the entire sequence, including boot, BIOS, bootloader, and kernel boot, of a new system install. You can also choose the playback speed to accelerate the replayed session.
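            The record-and-replay recipe above can be sketched like this (classic util-linux flags; the recorded command and file names are just examples):

            ```shell
            # Record a session: -t writes timing data to stderr (redirected to a
            # file here), -q suppresses the start/done banner, and -c runs a single
            # command instead of an interactive shell.
            script -q -t -c 'echo hello from a recorded session' session.log 2>timing.log

            # Replay it; the optional third argument is a speed divisor, so 2
            # means twice as fast as the original session.
            scriptreplay timing.log session.log 2
            ```

            Newer util-linux versions also accept the more explicit --log-timing flag for script.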

          1. 15

            A lot happens in /proc of course

            Valuable info includes:

            • /proc/$pid/exe -> always gives you the executable, even if it is inside a docker container
            • xargs -0 < /proc/$pid/environ
            • xargs -0 < /proc/$pid/cmdline
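            A quick, self-contained way to poke at these, using the process’s own "self" entry (any $pid you are allowed to read works the same):

            ```shell
            # Inspect the current process via /proc/self.
            # cmdline and environ are NUL-separated, so tr makes them readable.
            readlink /proc/self/exe            # path to the running executable
            tr '\0' '\n' < /proc/self/cmdline  # argv, one entry per line
            tr '\0' '\n' < /proc/self/environ  # environment, one entry per line
            ```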
            1. 3

              procfs is one of my favourite things in Linux! I miss it so much when I have to use macOS.

              1. 6

                My favouritest thing is strace….

                …which tells me a lot of things under the hood go frootling about in /proc….

                1. 3

                  For example…. Consider the very very useful lsof (list open files)

                  strace -e openat lsof

                  or even

                  strace -e openat ps

              2. 2

                Also /proc/$pid/wchan to see quickly which syscall your code is up to
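                A tiny illustration (a process blocked in a syscall usually shows the kernel function it is sleeping in; a runnable one shows 0):

                ```shell
                # Park a child in a syscall, then read its wait channel.
                sleep 30 &
                pid=$!
                cat /proc/$pid/wchan; echo   # e.g. hrtimer_nanosleep
                kill "$pid"
                ```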

              1. 4

                Familiarize yourself with composition: what Linux looks like as a disk image, a chroot, a container, etc. Things like Packer, mkosi, casync, welder, debootstrap, etc. This will lead into package management and servicing (updates, etc.)

                Then systemd. From Mattias [1], Enrico [2] or Lennart’s blog. You might want to follow the systemd releases page. You can use Debian or Fedora, but even if you use Debian I suggest you track the Fedora changelog.

                A well-organized collection of readings, focused on internals, is [3].

                People are already giving you great eBPF resources. New titles are coming out. I would suggest you experiment with Sysdig’s Falco.

                I’ve also learned a thing or three from Julia Evans’ zines and blog posts [4] and I bought her prints. And in terms of actual books, consider Kerrisk’s “The Linux Programming Interface” and the latest Nemeth et al. “UNIX and Linux System Administration Handbook”

                I hope this helps. I’ve been using Linux for over 15 years, creating distros, packaging, operating large fleets of it, using it as a desktop and more. I’m surprised how much of it evolves and reinvents itself. It keeps me an eternal learner and a perennial novice!

                [1] https://ma.ttias.be/learning-systemd/
                [2] https://www.enricozini.org/blog/2017/debian/systemd-01-intro/
                [3] https://0xax.gitbooks.io/linux-insides/
                [4] https://jvns.ca/

                1. 1

                  wow, thank you so much for sharing this!

                1. 11

                  Writing a simple shell is the hard way, but it touches a lot of Unix concepts (syscalls, libc, forking/zombies, signals, pipes, file descriptors, ttys/sessions, etc.) in a broad but hands-on way.
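                  As a rough sketch of the core loop (read, fork, exec, wait), deliberately ignoring quoting, pipes, redirection, signals, and job control:

                  ```shell
                  # Toy shell: each input line is split on whitespace and run as a command.
                  # A subshell stands in for fork(); exec replaces the child's image;
                  # wait reaps the child so no zombies accumulate.
                  mini_shell() {
                      while IFS= read -r line; do
                          case "$line" in
                              exit) return 0 ;;   # a built-in, handled by the shell itself
                              '')   continue ;;
                          esac
                          ( exec $line ) &        # fork + exec (word splitting intended here)
                          wait "$!"               # parent collects the child's exit status
                      done
                  }

                  # Drive it non-interactively:
                  printf 'echo hello from mini_shell\nexit\n' | mini_shell
                  ```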

                  1. 2

                    That’s actually a good idea. I did a really simple shell many years ago. I’m pretty sure I can do something more complex now.

                    1. 1

                      This is a very good idea. We did this for our operating systems class at uni, and I thought it was a lot of fun. We also wrote a very simple file system implementation which had a C API much like POSIX (read, write, open, close, seek, iirc).

                      Nowadays you could do the same in userland, and then maybe write a FUSE implementation around it and then maybe convert it into a kernel module later.

                    1. 5

                      I’m assuming you’re asking about using Linux as an operating system, and not about understanding the intricacies of the kernel.

                      This may sound like an odd recommendation, but the simple act of compiling the tools you use from source is surprisingly effective at exposing you to large swathes of how linux systems work. I’ll leave the explanation of why that’s true as an exercise for the reader :)

                      1. 1

                        It’s more about the intricacies of the kernel (and some user-space libraries to debug and trace the system)

                      1. 1

                        For clarification: Linux the kernel or Linux as an operating system (Linux kernel + userland)?

                        1. 1

                          Kernel + userland. I want to learn more about the kernel, but also about troubleshooting Linux problems, and how to use tools like ftrace, strace, eBPF, perf, etc.

                        1. 9

                          I’ve done this with teams before. Always regretted it. Flaky Tests should probably be called Sloppy Tests. They always point to a problem that no one wants to take the time to deal with.

                          1. 2

                            Flakiness isn’t always test flakiness. We also have infra flakiness (or intermittent infra failures) that is hard or impossible to solve. With respect to test retries, I somewhat agree with you, but as always, this is a matter of tradeoffs: do you prefer faster product development, or an extremely reliable product with developers spending time trying to fix issues that are false positives most of the time?

                            1. 1

                              I haven’t tried this retry approach, but my gut reaction is to agree with you. Reading the article my first reaction was “why not just fix the flaky tests”?

                              If the tests fail sporadically and often, how can you assume it’s the tests at fault and not the application code? And if it’s the latter, it’s affecting customers.

                              1. 1

                                When new software is running on pre-production hardware, the line of delineation is not so easy to draw. Flaky tests could be one or the other, and filtering them out (based on the cause being one or the other) is not exactly straightforward.

                              2. 1

                                It sounds bananas for GUI development, but it can make sense for statistical software, where some failures are to be expected. Maybe failures are unavoidable for some GUI environments? I can’t think why off the top of my head, though.

                                1. 1

                                  The biggest difficulty is that flaky tests in end-to-end environments involve basically every part of the system, so any sort of non-determinism or race condition (timing is almost always at the core of these) can be involved. Thank god JavaScript is single-threaded.

                                  I once had a test fail intermittently for weeks before I realised that a really subtle CSS rule causing a 0.1s color fade would cause differences in results if the test was executing “too fast”.

                                1. 2

                                  Great post, I’m always interested in how companies deal with flakiness.

                                  At Mozilla we attempted automatic retries, but we have so much flakiness that it was significantly impacting CI resources and we turned it back off. Instead, we let the next push act as the retry and have dedicated staff to monitor the CI and look for failure patterns (they can still manually retry if needed). There is also tooling to mute known intermittents from the UI, so developers have a better sense of whether or not they broke something.

                                  Having people manually perform a task that could be automated is not a sexy solution, but it works fairly well and is probably the right trade-off for us in a cost benefit analysis.

                                  1. 2

                                    significantly impacting CI resources

                                    We’ve seen that too, specifically for iOS, where we have more limited resources :(, but automatic retries are faster than developers looking for failures.

                                    have dedicated staff to monitor the CI and look for failure patterns

                                    I interned at Mozilla almost two years ago and I remember that there was a project to solve this. Sad to hear that it hasn’t been fully solved yet.

                                    1. 1

                                      You’re probably thinking of the autoclassify feature. That is being used and has reduced the amount of manual work sheriffs need to do. I don’t think it was ever intended to outright replace the sheriffs though.

                                      Tbh, I’m glad Mozilla isn’t throwing crazy resources at the intermittent problem. We have a system that works pretty effectively and for a fraction of the cost it would take to automate. That’s not to say we’ll stop making incremental improvements. Maybe one day it will be fully automated, just not through massive spending and heroic effort.

                                    2. 2

                                      At FB, we retried 3 times; if that didn’t work, we emailed the commit author. If a test was failing (with retries) on lots of diffs, we would email the contact address for the feature under test and take the test out of rotation. Very infrequently, we would re-run the failed tests; if one of them passed 50 times, we’d put it back in the rotation (or if someone pushed a fix and manually re-enabled the test).

                                      significantly impacting CI resources

                                      Yes. We did notice that :D
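                                      The retry-then-escalate flow described above can be sketched as a small CI wrapper (the function name and the 3-attempt limit are just illustrative):

                                      ```shell
                                      # Run a test command up to 3 times; a persistent failure is reported
                                      # (in a real CI this is where you would email the commit author).
                                      run_with_retries() {
                                          max=3
                                          n=1
                                          while ! "$@"; do
                                              if [ "$n" -ge "$max" ]; then
                                                  echo "still failing after $max attempts: $*" >&2
                                                  return 1
                                              fi
                                              n=$((n + 1))
                                          done
                                          return 0
                                      }
                                      ```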

                                      1. 1

                                        If a test was failing (with retries) on lots of diffs we would email the contact address for the feature under test and take the test out of rotation.

                                        We do this as well. Every intermittent has an orangefactor score, which is a proxy for how much pain it causes everyone (basically just failure rate and frequency). Once an intermittent passes a certain threshold, the relevant test is disabled after a short grace period to get it fixed.

                                    1. 2

                                      I feel like monorepos have been a hot topic these last few weeks. At least in my bubble (called work) they have, and it seems that holds here and on the orange site as well.

                                      Last year, we moved our Android apps and libraries to a monorepo and increased the size of our Android team.

                                      Shopify is certainly not just an Android shop, so this move does not mean that everything moved into one repo. I also saw that fuzziness at work, where “Monorepo” was the project name for merging the repositories of two projects, leaving lots of other repos to themselves.

                                      The term “Monorepo” has lost its literal meaning: Mono = Single as in “a single repo for the whole company”. Even Google does not have a single repo for the whole company because Android, Chrome, Go are not inside the big repository.

                                      The term is probably not about size. I would assume we can find a small company which uses a single repo just like Google and Facebook, but the repo is smaller than a project repo somewhere else.

                                      Any idea for a good definition?

                                      One approach could be: In a common repo you will find some build system stuff at the root. In a monorepo you will only find folders and build system stuff in there. There is no “build everything” in a monorepo.

                                      1. 2

                                        I always took monorepo to mean per product, not per company.

                                        1. 2

                                          The definition of monorepo is definitely not clear. We don’t understand it as a single repository for the entire company, but a repository containing different projects that share code. Using this definition, we have two mobile monorepos: one for Android, and another one for iOS.

                                        1. 4

                                          We have a lot of open positions at Shopify mostly in Canada (Toronto, Ottawa, Montreal).

                                          We use Ruby/Rails but also Go, Python… Great work environment and good perks!

                                          1. 2

                                            Nice to see remote is a possibility (I am in Mission BC & don’t want to move to KW ;). A former colleague left where I’m at for Shopify.

                                          1. 1

                                            At the same time, do you think it would make sense for a company to develop this commercially?

                                            Technology giants are so valuable because of their ability to find correlations in users’ data and create insights about users, which help them generate business based on similarities between users. (Again, it would be much better for users to get a locally optimized model, which is something Google has been adding to our phones’ keyboards.)

                                            1. 1

                                              IMHO, this is not only about privacy but also about yielding the computation to a distributed network of devices. As you say, Google has tested it on Google Keyboard (https://ai.googleblog.com/2017/04/federated-learning-collaborative.html). And as far as I know, some other companies such as Mozilla are investigating this technique.

                                            1. 1

                                              I think most of the issues the author has could be solved by a sparse git checkout. Basically, you check out only one subdirectory of the whole git repo. It’s faster, plus it allows for smaller checkouts.

                                              There is definitely truth in the objections though; a monorepo has disadvantages too. What I did was create the monorepo and do a split to commit to the individual repos again. It allowed us to see if it was a match for us without changing our tooling, as the individual repos were being kept up to date too. It helped us PoC it fast and not waste too much time.
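                                              For reference, a sparse checkout looks roughly like this on a modern git (2.25+). The toy repo with app/ and lib/ stands in for a real monorepo; with a real remote you could also add --filter=blob:none to the clone so unneeded blobs aren’t even fetched:

                                              ```shell
                                              # Build a stand-in "monorepo" with two top-level projects...
                                              dir=$(mktemp -d) && cd "$dir"
                                              git init -q big-monorepo
                                              (
                                                cd big-monorepo
                                                mkdir -p app lib
                                                echo app > app/main.txt
                                                echo lib > lib/util.txt
                                                git add -A
                                                git -c user.email=you@example.com -c user.name=you commit -qm init
                                              )

                                              # ...then keep only app/ in a fresh clone's working tree;
                                              # lib/ disappears from the checkout but stays in history.
                                              git clone -q big-monorepo work
                                              cd work
                                              git sparse-checkout init --cone
                                              git sparse-checkout set app
                                              ```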

                                              1. 1

                                                That would definitely help improve git performance if the repository’s history is too big, but there are other problems, such as running only the tests that cover the changed parts, that are hard to solve. Big companies using a monorepo model, such as Facebook and Google, have invested a lot of effort in designing tools (and even their own build systems; see https://bazel.build/ and https://buckbuild.com/) to make a monorepo efficient.

                                              1. 4

                                                Hey all, some context on this. Shield studies are studies run by Mozilla to try new features on a random population (https://wiki.mozilla.org/Firefox/Shield/Shield_Studies). Here you can get some context on why and how, and it’s possible to see which ones are being executed and which ones are in the queue to be executed in the future (https://wiki.mozilla.org/Firefox/Shield/Shield_Studies/Queue). It’s also important to point out that, even if you have the study installed, it doesn’t mean you’re sending data. Those studies usually only send data from 1%-2% of the population.

                                                Moreover, running these kinds of studies is always optional. They can be disabled in about:preferences#privacy by unchecking the “Allow Firefox to install and run studies” checkbox. And it’s also possible to see more information about the studies in about:studies#shieldStudies.

                                                If you really, really, want to see what data is sent by Firefox (Telemetry data, health data, Shield studies data…), it’s possible to go to about:telemetry and filter by type, see archived pings, and the raw JSON that is sent.

                                                1. 10

                                                  There are several main problems with this.

                                                  1. Optional does not mean opt-out, it means opt-in. You want to collect data from loyal Mozilla fans, then by all means give them the ability to turn it ON.

                                                  2. If #1 is unavoidable, don’t be unprofessional and don’t do mysterious things with the power that you grabbed from the opt-out default. “MY REALITY IS DIFFERENT THAN YOURS” is one of the worst things you can put in a description that loyal, unsuspecting users will see.

                                                  3. If #2 happens by accident, write an apology and clean up your act. The replies from Mozilla thus far have been “it’s shield studies”. This is so cold and tone-deaf. Tell us what you’re going to do to make it better and make sure we can still trust Firefox!

                                                  1. 2

                                                    I agree with you, but tbh, I don’t understand the problem with sharing anonymous data that can help improve a product you use every day. If users had to give explicit consent to share even the most basic data, Mozilla would never be able to understand how people use Firefox. Don’t misunderstand me, I really care about my privacy and I don’t want my data to be sold or used to show me ads, as other big companies do, but Mozilla’s policy on data privacy (https://www.mozilla.org/en-US/privacy/principles/) is very strict about that.

                                                  2. 1

                                                    Also, about:studies shows which studies are ongoing, and a link to prefs to change this.