Threads for ashishb

  1. 1

A docker-format container does not (always) require Docker Desktop to be running. On a Linux host, you can use Podman, which is daemonless and supports user-invoked (unprivileged) containers too. Yes, via Docker Desktop you get Windows and Mac coverage. Container UX is clunky for CLI apps, IMHO.

    1.  

Podman is better, but it is still less popular than Docker Desktop. Further, Podman requires a VM engine to be installed on macOS and Windows, afaik.

    1. 6

I’d suggest that auto-update is asking too much: provide functionality to check whether updates are available, but leave the sysadmin work to the users and the ecosystem managers. If you do that, then you can just throw up a pile of binaries under GitHub releases or whatever.

      1. 2

So is it ok for the binary to phone home regularly to check for updates? I thought most people would be uncomfortable with that.

        1. 4

          Oh, no no no, not unless enabled explicitly. I mean, have a --check-updates argument or whatever so I as a user can script it or check myself.
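
For example (the --check-updates flag and its exit-code behavior here are hypothetical, just to sketch the idea: exit 0 when a newer version exists, so the user can script their own policy):

mytool --check-updates
mytool --check-updates && notify-send "mytool: update available"   # e.g. from a weekly cron job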

          1. 2

            Fortunately, most distros will remove or disable your phone-home code.

            1. 4

I’m not planning to phone home. I want to give users an option to easily upgrade, e.g. brew upgrade or pip install --upgrade.

            2. 1

              i do this for some internal work tooling… but it doesn’t “phone home” it just hits an API for where my binaries are uploaded…

              if you have releases in github, take the example of casey’s excellent just:

              ❯ curl -L -s https://api.github.com/repos/casey/just/releases\?page\=1\&per_page\=1 | jq '.[0].name' -r
              1.13.0
              

              in my tool i take this output and compare it to the tool’s current version and output a message if it’s older (“hey, a new update is available”)

              of course i fail gracefully and quickly if connectivity isn’t there (short timeout)
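
roughly, the whole check looks like this in shell (a sketch: mytool --version is a hypothetical flag for the tool’s own version, sort -V does the version comparison, and -m 2 is the short timeout):

current=$(mytool --version)
latest=$(curl -m 2 -fsL 'https://api.github.com/repos/casey/just/releases?page=1&per_page=1' | jq -r '.[0].name')
if [ -n "$latest" ] && [ "$(printf '%s\n%s\n' "$current" "$latest" | sort -V | tail -n 1)" != "$current" ]; then
  echo "hey, a new update is available: $latest"   # stay quiet otherwise
fi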

              i wouldn’t call that “phoning home”

              1. 2

                i wouldn’t call that “phoning home”

                Users would - it’s still giving you an idea of how many people are using it and from where (source IP address), and you could start shipping extra information in the URL if and when you please. But if it’s just for work, who cares.

          1. 2

            Triggering steps based on manually specified globs makes me wince a bit, but it’s unclear what people should do instead. Bazel is the “proper” solution but it’s cruel to recommend that to someone who does not have a build team to look after it. Even once you have Bazel, you need more plumbing to get conditional CI jobs.

            Is there a good middle ground out there?

            1. 2

This is why Earthly was created: to give proper caching, isolation, and build parallelism in a more approachable manner than Bazel.

              https://earthly.dev/

              If you are working in Python, Pants is also worth a look.

              1. 1

                https://earthly.dev/

                Too hard to compete with GitHub Actions Marketplace at this time. I don’t want to rewrite custom plugins myself.

              2. 1

Triggering steps based on manually specified globs makes me wince a bit,

Why? Should the Python linter run when the Go code changes? Or should the TS linter run when the Rust code changes?

                1. 1

                  Because it’s something the build tool can calculate for you. Anything worth its salt can tell you exactly which files it used once you have a complete product, so the sensitivity list could be generated automatically. This is a classic strategy for traditional Makefiles when compiling C.

                  1. 1

Which build tool will tell you which files, say, pylint accesses?

                    1. 1

                      tup

              1. 17

                As a user, I don’t want auto-updates. I’d rather the package manager did it or I did it manually. For most command line tools I’m happy to manually download binaries from the project website or github releases if my package manager doesn’t have an acceptable version.

For getting into package repos, just use some language with a clear standard toolchain for packaging binaries, use that toolchain and politely email the repo maintainers. Rust and Go have obvious toolchains that produce static binaries. If you use unusual dependencies then it could still be a pain to get into package repos, because the maintainers will often want to package up the libraries separately from your tool (which is a bit shit, because then you don’t get control over the exact versions of your dependencies).

                Edit: if your tool requires regular updates to continue to work (e.g. it depends on some internet service whose API changes regularly) then a warning that I should update when I run it is fine if it has to connect to the internet anyway. Ideally, the tool shouldn’t talk to your server unless it has to.

                1. 6

Just to be clear, I don’t want magical auto-updates either. I don’t want to push auto-updates. What I meant was the ability for users to update via their package manager, e.g. brew upgrade.

                  1. 5

                    The question really boils down to a separation of concerns. In most non-Windows ecosystems (including Homebrew), author and packager are separate roles. The packager knows the details of the target platform and the packaging system, the author knows the details of the program. They might be the same person. More commonly, they are the same person for some platforms, but not others. As an author, you can make packagers happy by doing a few things:

                    • Make it easy to mechanically discover dependencies. If your language ecosystem has a standard way of doing this, use it. Otherwise, at least provide a clear list of them. The packager will typically want to use packaged versions, or at least register dependencies for auditing.
                    • Don’t require an internet connection during the build. Secure package-build infrastructure is sandboxed.
• Use a build system that other people use. If you use CMake, for example, it’s one line for me to tell the FreeBSD ports system how to build, and it will build with Ninja by default and so scale well on the build cluster. If you write custom scripts, it’s more work.
                    • Use clean platform abstractions. If someone wants to support a different OS, they shouldn’t need to go and find everywhere where you’ve written ‘if Linux’ and then figure out that what you meant was ‘if not MacOS’.
• Put as many platforms and architectures in CI as you can. Even if you don’t test on the ones I care about, testing on more means it’s more likely to work for me.
• Provide clear build instructions. Don’t make me guess what some environment variable needs to be (see the sketch after this list).
                    • If you autodetect dependencies, make it possible to opt out and specify them explicitly. When packaging, you want to be clear about everything that’s used.
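
For instance, a CMake project that follows these points can be driven entirely from the command line; a sketch of what packager-friendly instructions look like (the -DFOO_WITH_SSL flag is a hypothetical example of an explicit opt-in replacing autodetection):

cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DFOO_WITH_SSL=ON   # every knob spelled out, nothing guessed from the environment
cmake --build build
ctest --test-dir build     # requires CMake >= 3.20
cmake --install build --prefix /usr/local
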
                    1. 1

                      Makes sense. One cross-platform option that’s a bit of a hack is to distribute your binaries on npm: https://blog.xendit.engineer/how-we-repurposed-npm-to-publish-and-distribute-our-go-binaries-for-internal-cli-23981b80911b

                      It’s very convenient if your audience is likely to have npm or yarn or whatever installed.
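
The end-user side is then just this (hypothetical package name; per the linked post, the package’s install script fetches the right platform binary):

npm install -g @yourorg/yourtool
yourtool --version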

                  1. 4

                    With regard to option 2, getting your package into some package repos for a few major distributions does give you roughly this (modulo the limitations of each distro). I (at least notionally) try to target Gentoo, Debian, Ubuntu (specifically, though if you get into Debian properly then you will eventually be ported into Ubuntu), and Arch Linux when I’m not developing something tied to some language or environment’s package manager. But you might add one each of the Mac and Windows package management tools as well.

                    Edit: doing this (and also specifically structuring your program so that it is easily packageable), and providing .deb’s for direct download, means that distro maintainers for other distros will tend to be pretty willing to do the ‘last mile’ packaging, getting you fairly complete coverage.

                    1. 3

Let’s assume I have a way of generating arch-specific static binaries.

Is there an easy way to automate publishing the packages to:

                      1. homebrew
                      2. Ubuntu apt-get
                      3. Chocolatey (Windows)
                      1. 2

                        cargo-dist might be able to do that in future.

                        1. 1

                          So, usually you would work out how to package the project for all of these as part of your CI (rather than ‘just generating a static binary’), and then on a tag release automatically push the generated artifact to the relevant package manager. Eg I’ve seen people use GitHub Actions to wrap the static binary in the wrapper that Chocolatey needs and then push it to Chocolatey.

But the exact ‘how’ depends on the details of your whole thing. Eg packaging Rust things for Debian is actually a lot easier than that: you typically wouldn’t compile a static binary; you only need a debian directory in the root of your Rust project with a correctly formatted copyright file and a debcargo.toml file, which are processed, compiled, and distributed automatically by the Debian infrastructure once you have registered your package with the Debian Rust packaging team. It’s similar for Gentoo, except you need a gentoo directory with an ebuild file, and distributing binary packages requires a bit more setup on your end instead of being completely automatic on the distro infrastructure end.

                          Basically, you do need to learn a bit about ‘the maintainer life’ across the major platforms you want to release on, but the upside is that you get those nice ‘native OS ergonomics’.

                      1. 2

                        All of the suggested fixes come with big caveats:

                        1. @jclulow already covered caching
                        2. Cancelling stale executions is a problem if you want cleanup to run after each job. AFAIK cancelling just drops those if: always() jobs on the floor, possibly leaving you with a mess which needs to be cleaned up manually. Tearing down a few thousand stale S3 buckets gets pretty tedious, even with scripting.
3. Path filtering is harder than the article suggests. You need to remember to run the linter on all the files when the configuration changes or you upgrade the linter, for example (see the sketch after this list).
                        4. Timeouts are a pain to configure properly (and re-configure ad nauseam). Putting a limit at 2x the average will probably break a small but significant number of otherwise passing jobs. And unless you’ve stopped working on a project the runtime will keep changing over the lifetime of the project.
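
For point 3, one way to keep the filter honest is to include the linter’s own config and pinned version in the trigger set; a rough shell sketch (paths and file names are hypothetical):

changed=$(git diff --name-only origin/main...HEAD)
if echo "$changed" | grep -qE '^(python/|\.pylintrc|requirements-dev\.txt)'; then
  pylint --recursive=y python/   # re-lint everything when code or linter config changes
fi
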
                        1. 2

                          Tearing down a few thousand stale S3 buckets gets pretty tedious, even with scripting.

Yeah. Should most tests be writing to outside state, e.g. S3 buckets? I don’t think so. And as always, I never said that these rules are to be followed blindly.

                          Path filtering is harder than the article suggests. You need to remember to run the linter on all the files when the configuration changes or you upgrade the linter, for example.

                          I agree that it is non-trivial. I disagree that it is hard. Here’s a sample for a Python repo of mine. The efficiency gains are worth it. I have seen repos where Python tests run every time someone changes the README!

Putting a limit at 2x the average will probably break a small but significant number of otherwise passing jobs.

Yeah, it depends on a lot of cases. 2X is my rule of thumb. In fact, if the tests are slowing down over the lifetime of the project, that’s worth investigating on its own, so it is actually better for tests to fail and for someone to ask “Hey, how did these 30-minute tests start taking over an hour?”

                        1. 14

There are many criticisms I would level at GHA, but discarding the entire build environment between runs is not one of them. Preserving random bits of the build user’s home directory between runs (“dependency caching”) may speed up some CI jobs, but it also means you don’t catch certain errors, and your job output necessarily depends in some way on what jobs have run before. It’s not a safe default everybody can just switch on without understanding what it means.

                          It’s also a feat of mental gymnastics to suggest that no timeout is a poor default for new jobs, and then suggest that the way to come up with a good timeout is to use the average runtime of past runs of the job you’re creating!

                          1. 2

I think the right way of doing dependency caching is with a container. This guarantees a base environment that has only the dependencies that you expect. For GitHub, you can use the same container as the base layer of a dev container (build tools in the CI container, any additional user-facing tooling in the dev container), so your contributors can have precisely that environment for Codespaces.
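
A sketch of the workflow (image name and registry are hypothetical): build one image containing only the build tools, push it, and reference the same tag from CI and from the dev container.

docker build -t ghcr.io/you/ci-base:1.0 -f ci/Dockerfile .
docker push ghcr.io/you/ci-base:1.0
# CI jobs and .devcontainer/devcontainer.json both point at ghcr.io/you/ci-base:1.0,
# so a dependency exists only if it is baked into the image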

                            1. 2

                              Most languages have become good at not giving access to dependencies that you didn’t explicitly install.

You are right that dependency caching can lead to bugs, and if that’s a bigger concern than developer velocity, it is best not to enable it.

Having no default timeout is indeed wrong. A few runaway jobs consumed all my minutes! And that’s how I learned about it.

                              1. 1

i agree wholeheartedly about dependency caching; every time I’ve added it to a CI service that wasn’t Elixir+hex, I’ve regretted it.

                              1. 2

Half truth. Show me a really fast ripgrep replacement in TypeScript and I’ll agree.

                                1. 6

                                  Before ripgrep there was ack. It is fast and written in Perl. It never had the same hype as ripgrep, but was always a “better” grep.

                                  1. 4

I used ack. I didn’t feel it was as fast as rg.

                                    1. 1

                                      And between the two, there was ag, written in… C.

                                      1. 1

Not as fast, but fast enough, and more importantly it has better defaults than plain grep.

                                        1. 4

                                          Right; I’ve never once in my life written a program where the difference in speed between ack and ripgrep would have been noticeable by the end user.

There are plenty of correctness-based reasons to prefer Rust over Perl, but for an I/O-bound program, the speed argument is very rarely relevant.

                                          1. 7

                                            The difference of 8s is easily noticeable:

                                            $ time sh -c 'rg test linux-6.2.8 | wc -l'
                                            113187
                                            ________________________________________________________
                                            Executed in   13,69 secs    fish           external
                                               usr time    1,61 secs  781,00 micros    1,61 secs
                                               sys time    4,86 secs  218,00 micros    4,86 secs
                                            
                                            $ time sh -c 'ack test linux-6.2.8 | wc -l'
                                            113429
                                            ________________________________________________________
                                            Executed in   21,07 secs    fish           external
                                               usr time   12,82 secs    1,13 millis   12,81 secs
                                               sys time    4,25 secs    0,00 millis    4,25 secs
                                            
                                            1. 1

I dunno… I abuse ripgrep enough, in enough different situations, that I would probably notice if it got slower. But that might also lead to me being more careful about my workflow, instead of shoving gigabytes of data into it and saying “lol cpu go brrrr”.

                                    1. 0

so, do we finally have a MacBook replacement?

                                      1. 2

                                        No

                                      1. 1

                                        I search on GitHub. Not the best technique but better than Google search in terms of cutting down all the listicles that make it harder to find the actual tool or library.

                                        1. 2

                                          Like it or hate it, the reason users like Google’s AMP is because the normal web has become slow and bulky to use.

                                          1. 41

                                            Where can we read about those users that like Google’s AMP?

                                            1. 14

                                              AMP is a cancer on the modern web that spreads to everyone who copy-pastes AMP URLs. It’s mostly downside from both a user experience and technical perspective.

                                              Downsides to normal users:

                                              • URLs don’t mean anything anymore, so sharing pages with friends frequently involves copy pasting a huge AMP URL. Google frames this as an upside, that URLs shouldn’t mean anything anymore and that you should trust Google on this. The actual user experience of sharing an AMP URL with someone who then clicks that behemoth of a URL on desktop begs to differ.
                                              • More Google tracking and opportunities to serve ads
                                              • More opportunities for phishing and related attacks

                                              Downsides to the web:

                                              • More centralization of the web in Google
                                              • Fewer incentives to actually fix pageload times

                                              Upsides:

                                              • Faster pageloads by virtue of the request going through Google
                                              • Reader mode! Wait, Firefox already does this clientside.

                                              I’ve frequently thought about writing a little tool for integration into some chatbots to remove everything extraneous from pasted URLs, starting with AMP and possibly including garbage like the fbclid parameter for Facebook tracking, Google Analytics spyware query params, the new Chrome deep-linking URL fragments, imgur trying to serve you something other than an image, and so on. Unfortunately, this trend of bait-and-switch on URLs to serve more ads has become depressingly common. If anyone can point me to something that already does this, I’d love to hear about it.
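
The parameter-stripping part is almost a one-liner in shell (a naive sketch: the parameter list is incomplete, and it ignores URL fragments and encoded values):

clean_url() {
  printf '%s\n' "$1" | sed -E 's/(utm_[a-z]+|fbclid|gclid)=[^&]*&?//g; s/[?&]$//'
}
clean_url 'https://example.com/article?utm_source=feed&id=42'   # -> https://example.com/article?id=42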

                                              Edit: Found ClearURLs. Perfect.

                                              1. 13

                                                Users don’t “like” AMP. Users like fast, snappy content–it is the basest trick of marketing and developer evangelism that has conflated the two in an attempt to further fill the GOOG moat.

                                              1. 5

On my wishlist: a way to block all the bloody “Subscribe to my spiffy mailing list” popups that have infested the web.

                                                1. 2

                                                  Big same. I was working on a browser plugin to turn position:fixed/etc elements into display:none, but it ran into a wall of

                                                  1. literally the first wild website I tested it on hit an infinite loop
                                                  2. javascript permission errors when trying to introspect style sheets

                                                  I suspect dealing with it robustly would require hacking up the browser renderer itself.

                                                  1. 2

                                                    The No, Thanks extension gets rid of some of them. Enough that I’m willing to pay its subscription fee because those stupid things make my blood boil, but it still misses a bunch.

                                                    1. 1

                                                      Thanks, I’ll give it a spin.

                                                    2. 1

The unfortunate reality is that they work. I remember reading, I think, Andrew Chen (a16z), who mentioned that he feels sorry about these popups but has to keep them on his blog since they work.

                                                      1. 3

                                                        Andrew Chen doesn’t have to have these annoying popups on his blog, he could perfectly well choose to have a button or a link. Truth is that he chose the annoying popups because he values the number of subscriptions more than the wellbeing of his audience.

                                                        1. 1

Do you have the source / data for that? I’m not even sure how you’d measure how well they work. I assume you’d have to do some A/B testing, but while you can measure the number of people who sign up for your newsletter, and possibly even track whether the emails cause them to come back to your blog, you can’t measure the people who are unimpressed or get annoyed and don’t come back or recommend your blog to others.

                                                      1. 5

To be that person: I have no trust in a project that explicitly calls out the “nonsense that is the urbit project” without at all mentioning the nonsense that is its ideological foundation in far-right reactionary thought.

                                                        1. 3

                                                          far right reactionary thought

                                                          Wouldn’t reactionary thought eschew solutions using technology at their core? I think they identify as neoreactionary for this reason…

                                                          1. 2

                                                            Exactly. And the first two paragraphs don’t even tell me what exactly this project is about.

                                                          1. 2

This explains why I am getting a deluge of small, low-quality PRs on one of my open-source repos. The repo isn’t even code, just a list of things.

                                                            1. 3

No one likes YAML, but it survives. I am definitely amused at such cryptic languages, which thrive despite being kludgy.

                                                              1. 9

No one likes YAML, but it survives. I am definitely amused at such cryptic languages, which thrive despite being kludgy.

                                                                “Good things come to an end, bad things have to be stopped.” ― Kim Newman

                                                              1. 1

                                                                A commit to turn a flag on?

                                                                1. 2

You are storing flag values somewhere. If you are not using a git commit to turn them on, then you must have a separate full-fledged system to record when the flag was modified, by whom, and who approved it.

                                                                1. 3

Please make sure that it is a three-step approach: you need to go back and remove the old branches eventually. As long as the other branches exist, they complicate future changes.

                                                                  1. 1

Good point. You are right; in some cases, it is three steps. While writing this I implicitly assumed the deletion of old code.

                                                                  1. 0

                                                                    There is only one GNU/Linux distro that was remotely comparable to the Mac OS user experience. Alas. No more.

                                                                    1. 1

                                                                      This scripting seems to replicate features of a build system. I wondered about this before: Why does nobody treat test reports as build artifacts? Let Make (or whatever) figure out the dependencies and incremental creation.

                                                                      1. 2

Sometimes you have multiple build systems. For example, let’s say I have a repo with two independent dirs: one containing JavaScript (npm builds) and one containing Android (Gradle builds). Both build incrementally fine on my machine, but in CI, if I am only modifying the Android code then it is a waste to build and test the JavaScript dir. Incremental creation does not work since the past artifacts are missing, and they are intentionally missing to ensure that the builds are clean and reusable.

I have actually seen a case where keeping the past artifacts created a subtle bug, which went away after we removed persistent caching from CircleCI.

                                                                        1. 2

Some build systems, e.g. Bazel, do this (it’s called “caching”, the same as saving build artifacts). Bazel is especially designed for monorepos. Probably Buck, a build system with similar origins, does this too.

However, writing tests for this behavior can be tricky, as it requires “hermeticity”: tests can only read data from their direct dependencies. Otherwise, a “green” build may become cached and stay green in subsequent runs, even though it would turn red if the cache were cleared.

Sadly, it’s quite hard to use Bazel for js/ruby/python and similar: it does not have builtin rules for these languages’ ecosystems, and for a general shell-based rule you have to know what files your shell command will output before it runs (directories can’t be the output of rules).

                                                                          1. 2

My inspiration in some form came from both Bazel (which I used inside Google) and Buck (which I used at Facebook). Both are great tools. Setting them up and training the whole team to use them, however, is a time-consuming effort.

                                                                            1. 2

                                                                              it requires “hermeticity”: tests can only read data from their direct dependencies.

                                                                              Nix is able to guarantee this since it heavily sandboxes builds, only allowing access to declared dependencies and the build directory. I haven’t seen anyone exploiting this for CI yet but it might be worth playing with.

                                                                              1. 1

How long does it take to set up Nix?

                                                                                1. 1

I’m not really sure how to answer that. Learning Nix definitely takes a while and the documentation isn’t great. Writing a hello-world build script takes seconds. Setting up some really complicated system probably takes longer ¯\_(ツ)_/¯

                                                                                  I guess I can at least point at some examples:

                                                                                  1. 1

                                                                                    Thanks. After reading through the links, I am happy with my setup which returns 99% of the benefit without making every developer learn to write a new build system.

                                                                          1. 1

Go is great for concurrency, IMHO. Rust might be better, but it is still somewhat unstable.