I'd suggest that auto-update is asking too much. Have functionality to check whether updates are available, but leave the sysadmin work to the users and the ecosystem managers. If you do that, then you can just throw up a pile of binaries under GitHub releases or whatever.
So is it ok for the binary to phone home regularly to check for updates? I thought most people would be uncomfortable with that.
Oh, no no no, not unless enabled explicitly. I mean, have a --check-updates argument or whatever so I as a user can script it or check myself.
I’m not planning to phone home.
I want to give users an option to easily upgrade, e.g. brew upgrade or pip upgrade.
i do this for some internal work tooling… but it doesn't "phone home", it just hits the API of the service where my binaries are uploaded…
if you have releases in github, take the example of casey's excellent just:
❯ curl -L -s https://api.github.com/repos/casey/just/releases\?page\=1\&per_page\=1 | jq '.[0].name' -r
1.13.0
in my tool i take this output and compare it to the tool’s current version and output a message if it’s older (“hey, a new update is available”)
of course i fail gracefully and quickly if connectivity isn’t there (short timeout)
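A minimal sketch of that pattern, reusing the curl/jq call above (the current-version string and message are placeholders, not any particular tool's):

#!/bin/sh
# Hypothetical sketch: compare the latest GitHub release name against the
# tool's own version and print a notice if they differ.
CURRENT_VERSION="1.12.0"   # placeholder for the version baked into the tool

# Short timeout so the check fails fast and quietly when there's no connectivity.
LATEST=$(curl -L -s --max-time 2 \
  'https://api.github.com/repos/casey/just/releases?page=1&per_page=1' \
  | jq -r '.[0].name') || exit 0

if [ -n "$LATEST" ] && [ "$LATEST" != "$CURRENT_VERSION" ]; then
  echo "hey, a new update is available: $LATEST (you have $CURRENT_VERSION)"
fi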
i wouldn’t call that “phoning home”
i wouldn’t call that “phoning home”
Users would - it’s still giving you an idea of how many people are using it and from where (source IP address), and you could start shipping extra information in the URL if and when you please. But if it’s just for work, who cares.
Triggering steps based on manually specified globs makes me wince a bit, but it’s unclear what people should do instead. Bazel is the “proper” solution but it’s cruel to recommend that to someone who does not have a build team to look after it. Even once you have Bazel, you need more plumbing to get conditional CI jobs.
Is there a good middle ground out there?
This is why Earthly was created: to give you proper caching, isolation, and build parallelism in a more approachable manner than Bazel.
If you are working in Python, Pants is also worth a look.
Too hard to compete with GitHub Actions Marketplace at this time. I don’t want to rewrite custom plugins myself.
Triggering steps based on manually specified globs makes me wince a bit
Why? Should the Python linter run when the Go code changes? Or should the TS linter run when the Rust code changes?
Because it’s something the build tool can calculate for you. Anything worth its salt can tell you exactly which files it used once you have a complete product, so the sensitivity list could be generated automatically. This is a classic strategy for traditional Makefiles when compiling C.
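For C, for instance, the compiler itself will emit that sensitivity list for you (file names here are made up):

$ cc -MM src/main.c
main.o: src/main.c src/util.h src/config.h

In a traditional Makefile you'd typically generate these with -MMD as a side effect of compilation and -include the resulting .d files, so the list stays up to date without anyone maintaining globs by hand.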
As a user, I don’t want auto-updates. I’d rather the package manager did it or I did it manually. For most command line tools I’m happy to manually download binaries from the project website or github releases if my package manager doesn’t have an acceptable version.
For getting into package repos, just use some language with a clear standard toolchain for packaging binaries, use that toolchain and politely email the repo maintainers. Rust and Go have obvious toolchains that produce static binaries. If you use unusual dependencies then it could still be a pain to get into package repos because the maintainers will often want to package up the libraries separately from your tool (which is a bit shit, because then you don’t get control over what exact version your dependencies are).
Edit: if your tool requires regular updates to continue to work (e.g. it depends on some internet service whose API changes regularly) then a warning that I should update when I run it is fine if it has to connect to the internet anyway. Ideally, the tool shouldn’t talk to your server unless it has to.
Just to be clear, I don’t want magical auto-updates either.
I don’t want to push auto-updates.
What I meant was the ability for users to update via their package manager, e.g. brew upgrade.
The question really boils down to a separation of concerns. In most non-Windows ecosystems (including Homebrew), author and packager are separate roles. The packager knows the details of the target platform and the packaging system, the author knows the details of the program. They might be the same person. More commonly, they are the same person for some platforms, but not others. As an author, you can make packagers happy by doing a few things:
Makes sense. One cross-platform option that’s a bit of a hack is to distribute your binaries on npm: https://blog.xendit.engineer/how-we-repurposed-npm-to-publish-and-distribute-our-go-binaries-for-internal-cli-23981b80911b
It’s very convenient if your audience is likely to have npm or yarn or whatever installed.
With regard to option 2, getting your package into some package repos for a few major distributions does give you roughly this (modulo the limitations of each distro). I (at least notionally) try to target Gentoo, Debian, Ubuntu (specifically, though if you get into Debian properly then you will eventually be ported into Ubuntu), and Arch Linux when I’m not developing something tied to some language or environment’s package manager. But you might add one each of the Mac and Windows package management tools as well.
Edit: doing this (and also specifically structuring your program so that it is easily packageable), and providing .deb’s for direct download, means that distro maintainers for other distros will tend to be pretty willing to do the ‘last mile’ packaging, getting you fairly complete coverage.
Let's assume I have a way of generating arch-specific static binaries. Any easy way to automate publishing the packages into these package repos?
cargo-dist might be able to do that in future.
So, usually you would work out how to package the project for all of these as part of your CI (rather than ‘just generating a static binary’), and then on a tag release automatically push the generated artifact to the relevant package manager. Eg I’ve seen people use GitHub Actions to wrap the static binary in the wrapper that Chocolatey needs and then push it to Chocolatey.
But the exact 'how' depends on the details of your whole thing. Eg for packaging Rust things for Debian it is actually a lot easier than that: you typically wouldn't compile a static binary; you only need a debian directory in the root of your Rust project with a correctly formatted copyright file and a debcargo.toml file, which are processed, compiled, and distributed automatically by the Debian infrastructure once you have registered your package with the Debian Rust packaging team. Similar for Gentoo, except you need a gentoo directory with an ebuild file, and distributing binary packages requires a bit more setup on your end instead of being completely automatic on the distro infrastructure end.
Basically, you do need to learn a bit about ‘the maintainer life’ across the major platforms you want to release on, but the upside is that you get those nice ‘native OS ergonomics’.
All of the suggested fixes come with big caveats: if: always() cleanup jobs can still get dropped on the floor, possibly leaving you with a mess which needs to be cleaned up manually. Tearing down a few thousand stale S3 buckets gets pretty tedious, even with scripting.
Yeah. Should most tests be writing to outside state, e.g. S3 buckets? I don't think so. And as always, I never said these rules should be followed blindly.
Path filtering is harder than the article suggests. You need to remember to run the linter on all the files when the configuration changes or you upgrade the linter, for example.
I agree that it is non-trivial. I disagree that it is hard. Here’s a sample for a Python repo of mine. The efficiency gains are worth it. I have seen repos where Python tests run every time someone changes the README!
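The gist of such a filter, sketched as a shell step (paths and commands here are illustrative, not the actual sample); GitHub Actions' built-in paths:/paths-ignore: filters express the same idea declaratively:

#!/bin/sh
# Illustrative sketch: only run the Python checks if Python code, its
# dependencies, or the linter configuration changed.
CHANGED=$(git diff --name-only origin/main...HEAD)

if echo "$CHANGED" | grep -qE '\.py$|^requirements.*\.txt$|^setup\.cfg$|^\.flake8$'; then
  echo "Python-related files changed; running linter and tests"
  # flake8 src/ && pytest
else
  echo "No Python-related changes; skipping"
fi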
Putting a limit at 2x the average will probably break a small but significant number of otherwise passing jobs.
Yeah, it depends on a lot of cases; 2x is my rule of thumb. In fact, if the tests are slowing down over the lifetime of the project, that's worth investigating on its own, so it is actually better for the tests to fail and for someone to ask, "Hey, how did these 30-minute tests come to take over an hour?"
There are many criticisms I would level at GHA, but not discarding the entire build environment between runs is not one of them. Preserving random bits of the build user’s home directory between runs (“dependency caching”) may speed up some CI jobs, but it will also mean you don’t catch certain errors and your job output necessarily depends on what jobs have run before in some way. It’s not a safe default everybody can just switch on without understanding what it means.
It’s also a feat of mental gymnastics to suggest that no timeout is a poor default for new jobs, and then suggest that the way to come up with a good timeout is to use the average runtime of past runs of the job you’re creating!
I think the right way of doing dependency caching is with a container. This guarantees a base environment that has only the dependencies that you expect. For GitHub, you can use the same container as the base layer of a dev container (build tools in the CI container, any additional user-facing tooling in the dev container), so your contributors can have precisely that environment in Codespaces.
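A minimal sketch of that setup, assuming a prebuilt image (name and tag are placeholders):

# Hypothetical sketch: the only "cached" dependencies are the ones baked into
# the image, so every run starts from the same known state.
docker run --rm \
  -v "$PWD":/src -w /src \
  ghcr.io/example/ci-base:1.2.3 \
  ./scripts/run-tests.sh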
Most languages have become good at not giving access to dependencies that you didn’t explicitly install.
You are right that dependency caching can lead to bugs, and if that's a bigger concern than developer velocity, it is best not to enable it.
No default timeouts is indeed wrong. A few badly running jobs consumed all my minutes! And that’s how I learned about it.
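Besides the CI system's own per-job timeout setting, a crude extra guard is to cap the slow command itself with coreutils timeout (the 30-minute figure is arbitrary):

# Kill the test run if it exceeds 30 minutes; timeout exits with status 124 on expiry.
timeout 30m ./scripts/run-tests.sh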
i agree wholeheartedly about dependency caching; every time I've added it to a CI service that wasn't Elixir+hex I've regretted it.
Before ripgrep there was ack. It is fast and written in Perl. It never had the same hype as ripgrep, but was always a “better” grep.
Right; I’ve never once in my life written a program where the difference in speed between ack and ripgrep would have been noticeable by the end user.
There are plenty of correctness-based reasons to prefer Rust over Perl, but for an I/O-bound program the speed argument is very rarely relevant.
The difference of 8s is easily noticeable:
$ time sh -c 'rg test linux-6.2.8 | wc -l'
113187
________________________________________________________
Executed in 13,69 secs fish external
usr time 1,61 secs 781,00 micros 1,61 secs
sys time 4,86 secs 218,00 micros 4,86 secs
$ time sh -c 'ack test linux-6.2.8 | wc -l'
113429
________________________________________________________
Executed in 21,07 secs fish external
usr time 12,82 secs 1,13 millis 12,81 secs
sys time 4,25 secs 0,00 millis 4,25 secs
I dunno… I abuse ripgrep enough in enough various situations that I would probably notice if it got slower. But that also might lead to me being more careful about my workflow, instead of shoving gigabytes of data into it and saying “lol cpu go brrrr”.
I search on GitHub. Not the best technique but better than Google search in terms of cutting down all the listicles that make it harder to find the actual tool or library.
Like it or hate it, the reason users like Google’s AMP is because the normal web has become slow and bulky to use.
AMP is a cancer on the modern web that spreads to everyone who copy-pastes AMP URLs. It’s mostly downside from both a user experience and technical perspective.
Downsides to normal users:
Downsides to the web:
Upsides:
I've frequently thought about writing a little tool for integration into some chatbots to remove everything extraneous from pasted URLs, starting with AMP and possibly including garbage like the fbclid parameter for Facebook tracking, Google Analytics spyware query params, the new Chrome deep-linking URL fragments, imgur trying to serve you something other than an image, and so on. Unfortunately, this trend of bait-and-switch on URLs to serve more ads has become depressingly common. If anyone can point me to something that already does this, I'd love to hear about it.
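For the query-string part, even a crude filter covers the common offenders; a sketch (the parameter list is illustrative, not exhaustive):

# Illustrative only: strip a few well-known tracking parameters and the
# Chrome text-fragment deep link from a pasted URL.
clean_url() {
  printf '%s\n' "$1" \
    | sed -E 's/(utm_[a-z]+|fbclid|gclid)=[^&#]*&?//g; s/[?&]$//; s/#:~:text=.*$//'
}

clean_url 'https://example.com/post?utm_source=x&fbclid=abc123'
# -> https://example.com/post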
Edit: Found ClearURLs. Perfect.
Users don't "like" AMP. Users like fast, snappy content; it is the basest trick of marketing and developer evangelism to conflate the two in an attempt to further fill the GOOG moat.
On my wishlist: a way to block all the bloody "Subscribe to my spiffy mailing list" popups that have infested the web.
Big same. I was working on a browser plugin to turn position:fixed etc. elements into display:none, but it ran into a wall of edge cases.
I suspect dealing with it robustly would require hacking up the browser renderer itself.
The No, Thanks extension gets rid of some of them. Enough that I’m willing to pay its subscription fee because those stupid things make my blood boil, but it still misses a bunch.
The unfortunate reality is that they work. I remember reading, I think, Andrew Chen (A16Z), who mentioned that he feels bad about these popups but has to keep them on his blog since they work.
Andrew Chen doesn’t have to have these annoying popups on his blog, he could perfectly well choose to have a button or a link. Truth is that he chose the annoying popups because he values the number of subscriptions more than the wellbeing of his audience.
Do you have the source / data for that? I'm not even sure how you'd measure how well they work. I assume you'd have to do some A/B testing, but while you can measure the number of people who sign up for your newsletter, and possibly even track whether the emails cause them to come back to your blog, you can't measure the people who are unimpressed or get annoyed and don't come back or recommend your blog to others.
To be that person, I have no trust in a project that explicitly calls out the "nonsense that is the urbit project" without at all mentioning the nonsense that is its ideological foundation in far right reactionary thought.
far right reactionary thought
Wouldn’t reactionary thought eschew solutions using technology at their core? I think they identify as neoreactionary for this reason…
This explains why I am getting a deluge of small, low-quality PRs on one of my open source repos. The repo isn't even code, just a list of things.
No one likes YAML but it survives. I am definitely amused by such cryptic languages, which thrive despite being kludgy.
No one likes YAML but it survives. I am definitely amused by such cryptic languages, which thrive despite being kludgy.
“Good things come to an end, bad things have to be stopped.” ― Kim Newman
You are storing flag values somewhere. If you are not using git commits to turn them on, then you must have a different full-fledged system to record when a flag was modified, by whom, and who approved it.
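With the flag values checked into the repo (the file path and flag name below are hypothetical), git itself provides that record:

# Who last changed the flags file, when, and in which (reviewed) commit:
git log -1 --format='%h %an %ad %s' -- config/feature_flags.yaml

# Full history of one specific flag:
git log -p -S 'enable_new_checkout' -- config/feature_flags.yaml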
Please make sure that it is a three-step approach: you need to go back and remove the old branches eventually. As long as the other branches exist they complicate future changes.
Good point. You are right, in some cases it is three steps. While writing this I implicitly assumed the deletion of old code.
There was only one GNU/Linux distro that was remotely comparable to the Mac OS user experience. Alas. No more.
This scripting seems to replicate features of a build system. I wondered about this before: Why does nobody treat test reports as build artifacts? Let Make (or whatever) figure out the dependencies and incremental creation.
Sometimes you have multiple build systems. For example, let's say I have a repo with two independent dirs: one containing JavaScript (npm builds) and one containing Android (Gradle builds). Both build incrementally fine on my machine, but on CI, if I am only modifying the Android code then it is a waste to build and test the JavaScript dir. Incremental creation does not work since the past artifacts are missing, and they are intentionally missing to ensure that the builds are clean and reusable.
I have actually seen a case where keeping past artifacts created a subtle bug, which went away after we removed persistent caching from CircleCI.
Some build systems, e.g. Bazel, do it (it's called "caching", the same as saving build artifacts). That build system is specifically designed for monorepos. Buck, a build system with similar origins, probably does this too.
However, writing tests for this behavior can be tricky, as it requires "hermeticity": tests can only read data from their direct dependencies. Otherwise, a "green" build may get cached and stay green in subsequent runs, when it would turn red if the cache were cleared.
Sadly, it's quite hard to use Bazel for JS/Ruby/Python and similar: it does not have built-in rules for those ecosystems, and for a general shell-based rule you have to know which files your shell command will output before it runs (directories can't be the output of rules).
My inspiration in some form came from both Bazel (which I used inside Google) and Buck (which I used at Facebook). Both are great tools. Setting them up and training the whole team to use them, however, is a time-consuming effort.
it requires “hermeticity”: tests can only read data from their direct dependencies.
Nix is able to guarantee this since it heavily sandboxes builds, only allowing access to declared dependencies and the build directory. I haven’t seen anyone exploiting this for CI yet but it might be worth playing with.
I'm not really sure how to answer that. Learning nix definitely takes a while and the documentation isn't great. Writing a hello world build script takes seconds. Setting up some really complicated system probably takes longer ¯\_(ツ)_/¯
I guess I can at least point at some examples:
Thanks. After reading through the links, I am happy with my setup which returns 99% of the benefit without making every developer learn to write a new build system.
A Docker-format container does not (always) require Docker Desktop to be running. On a Linux host, you can use Podman, which is daemonless and supports user-invoked (unprivileged) containers too. Yes, via Docker Desktop you get Windows and Mac coverage. Container UX is clunky for CLI apps, IMHO.
Podman is better but it is still less popular than Docker Desktop. Further, Podman requires a VM engine to be installed afaik.