Even though operating system package managers seem to be falling out of favor in exchange for a proliferation of language-specific package managers, what can software authors do to make it easier for people to package their software?
Some aspects that I’ve been thinking about:
- What can be done to minimize friction with building and installation?
- Is a `curl | bash` pattern acceptable to packagers, or should it be expected that they will always build things the hard way from source?
- Are there entire languages or ecosystems to avoid for software that desires to be packaged?
- How recent may dependencies and language versions be, or should authors aim for the latest versions of everything and let the packagers sort it out?
- Are there projects that serve as positive or negative examples that could be referenced?
- What is some information that would be useful to have for the people who backport bugfixes and security vulnerability fixes to older, packaged versions of the software?
For C/C++ in particular:
- Is autotools still the way to go to provide a “uniform” build environment?
- Are plain Makefiles okay if they respond to `DESTDIR` and `PREFIX`?
- Is it reasonable to use other build systems that include build-time dependencies beyond make (e.g. scons, CMake)?
First off, thanks for asking! One of the most demotivating things is when upstreams don’t care!
I am speaking from the perspective of an OpenBSD port maintainer, so everything will have a fairly OpenBSD-centric slant.
Responses to your questions:

Use one build tool. Know it well. Often the build tool has already solved the problems you are attempting to solve. For example:
The other end of "not using one tool" is reaching for other build tools from inside an existing one. This just makes things suck, as packagers have to untangle a heap of things to make the program work.
Always build from source. I call these curl | bash scripts "icebergs", as they often look small until you glance under the surface. 90% of the time, things that "install" via this method use `#!/bin/bash`, which is absolutely not portable. Then, typically, the script will pull down all the dependencies needed to build $tool (this usually involves LLVM!!), then attempt to build said deps (99.9% of the time without success) with whatever build tool/wrapper the author has decided to use. Meanwhile, all the dependencies are available as packages (most of the time with system-specific patches to make them build), and they could have simply been installed using the OS's package manager.

NPM. We tried to make packages out of npm early on and ultimately hit version gridlock. There are still things we package that build npm things (firefox, chromium, kibana), but they ship a pre-populated `node_modules` directory.

I have linked a few examples above, but if you want many, dig down to the `patches` directory for any port in the OpenBSD ports tree; you can see exactly what we deal with when porting various things. My goal with the links is not to say "look how stupid these guys are!", it's simply to point out that we could all stand to know our tooling / target users a bit better.
In general, the best thing you can do is understand the tools you are using. Know their limits. Know when a tool is the right one for a given job.
Also don't think of your build environment as something that end users will have to deal with. Doing so creates a false sense of "I must make this easy to build!" and results in non-portable things like `curl | bash` icebergs. The vast majority of the time it will be package maintainers who are working with it.

For fun I made a quick and dirty list of ports and their respective patch count; here are the top 4:
I agree with nearly everything you said here, but there is one thing I would like to push back on:
I went through the very painful process of packaging my own upstream software for Debian and ran into this rule about having to package all the dependencies, recursively. I don’t think a policy that mandates NPM packages must all be separate system packages is reasonable. I ended up rewriting most of my own dependencies, even creating browserify-lite, because reimplementing Browserify from scratch with no dependencies was easier than packaging them. Mind you this is a mere build dependency.
On top of this, some JS dependencies are small, and that's fine. But Debian ftp masters didn't accept dependencies that were smaller than 100 lines, while also not accepting pre-bundling node_modules.
The policy is outdated. I've heard all the arguments, I'm familiar with the DFSG, but I think it's too strict with regards to static dependencies. Static dependencies are sometimes quite appropriate. In particular, just bundle the node_modules folder. It's fine. If there's a package in particular that should be system-wide, or you need patches for the system, go for it. Make that package a system package. Allow multiple versions to solve version lock.
It’s not. If it’s just a “blob” basically, how do you then know when one of those modules is in need of a security upgrade? The entire concept of packaging is based on knowing which software is on your system, and to avoid having multiple copies of various dependencies scattered around.
The real problem is the proliferation of dependencies and the break-neck speed at which the NPM ecosystem is developed.
This is the most common argument I hear. The answer is simple: the security upgrade is upstream’s problem.
The distribution / package manager must accept this fact. Upstream could, for example, have a security issue that is in its application code, not any of its dependencies. Due to this fact, distros already must track upstream releases in order to stay on top of security upgrades. It is then, in turn, upstream’s job to stay on top of security upgrades in its own dependencies. If upstream depends on FooDep 1.0.0, and FooDep releases 1.0.1 security update, it is then upstream’s job to upgrade and then make a new upstream security fix release. Once this is done, the distro will already be tracking upstream for security releases, and pull this fix in.
I don’t buy this security upgrade argument. Static dependencies are fine.
So the real problem is how productive the community is and how successful they are at code reuse? No. The real problem is outdated package management policies.
That’s insane. First of all, CVEs are filed against specific pieces of software. For truly critical bugs, distribution maintainers often coordinate disclosure in such a way that the bugs are fixed before details are released into the wild. This allows users to patch before being at too much risk.
Expecting that all upstreams which include said package statically will update it in a timely manner means that those upstream packages need to be conscientiously maintained. This is a pipe dream; there are lots and lots of packages which are unmaintained or maintained in a haphazard way.
As a user, this is exactly what you don’t want: if there’s a critical security update, I want to be able to do “list packages”, check if I’m running a patched one, and if not, a simple “update packages” should pull in the fix. I don’t necessarily know exactly which programs are using exactly which dependencies. Using a package manager means I don’t have to (that’s the entire point of package managers).
So, if some packages on my system included their dependencies statically, I would be at risk without even knowing it.
So instead of updating 1 package, this requires the (already overworked) package maintainers to update 50 packages. I don’t think this is progress.
I agree, individual packaging of recursive deps is a nightmare.
The one thing that makes it a “feature” on OpenBSD is the forced use of ftp(1), which has extra security measures like pledge(2). I trust ftp(1) much more than npm/node/other tools when it comes to downloading things.
One super downside to the "use ftp to fetch" method is that unless the tools in question (npm, go get, gem) expose the dependency resolution in a way that can be used by the ports framework/ftp, you are basically stuck re-implementing dependency resolution, which is stupid.
I didn't know about the 100-line limit. That's interesting! I can see it being a pain in the butt, but I also appreciate the "minimize dependencies" approach!
This is all excellent advice. I would also add that commercial and other financially backed publishers can’t outsource this, and can’t shortchange this. It’s important to have a distro liaison if you care about the end user experience. If you don’t have this you end up in the GNOME or Skype or NPM or Docker situations. Invest in testing the packages. Monitor the bug reports. Stay ahead of changes on the dependency tree. And why not, sponsor the bug bash and the BoF and all of that.
I’m going to address the title directly:
Step one is to actually build your software on more than Linux. Shake out the Linuxisms by building on the BSDs. This will uncover often trivial but common issues like:
- `make` is GNU make
- `/bin/sh` is `bash`
- `/usr/bin` vs. `/usr/local/bin`
I wrote a crypto library. Single file, zero dependency. Not even libc. I do need to perform some tests, though. And even for such a simple thing, not assuming GNU make is just too hard. How come the following does not even work?
If I recall correctly, `$^` is not available on good old obsolete pmake. So what do I do, put every single dependency list in a variable?

Another little hassle:
Depending on that pre-processor flag, I need to include the optional code… or not. Is there any “portable” way to do so?
Sure, assuming GNU everywhere is not a good idea. But some GNU extensions are just such a no-brainer. Why don't BSD tools adopt some of them? Why not even `$^`?

Feel free to assume GNU make (IMO), just make sure your sub-invocations are `$(MAKE)` instead of `make`.
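To illustrate, here is a minimal sketch (the subdirectory names are invented) of what "use `$(MAKE)` for sub-invocations" means; `$(MAKE)` expands to whatever make binary started the build, so running `gmake` at the top level carries all the way down:

```
# Delegate to subdirectories with $(MAKE) rather than a hard-coded "make",
# so the same make implementation is used for every sub-invocation.
SUBDIRS = src tests

all:
	for d in $(SUBDIRS); do (cd $$d && $(MAKE) all) || exit 1; done

clean:
	for d in $(SUBDIRS); do (cd $$d && $(MAKE) clean) || exit 1; done
```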
That could work indeed. My current makefiles aren’t recursive, but I’ll keep that in mind, thanks.
Same with `bash` instead of `sh`: it's fine to use bash, just be conscious about it ("I want bash feature X") and use `#!/usr/bin/env bash`. The problem is assuming `/bin/sh` is `/bin/bash`, which can lead to some rather confusing errors.

So my suggestion wasn't to not use GNU make, it was to not assume `make` == GNU make. What this means is: if building requires GNU make, then call that out explicitly in the build instructions. This makes it easier for packagers to know they need to add a dependency on `gmake`. If there are scripts or other automation that call `make`, then allow the name of the command to be easily overridden, or check for `gmake` and use that if found rather than calling `make` directly and assuming it's GNU make.

Why would any generic userland application want to build their software beyond Linux environments?! E.g., is Slack or Skype actively doing this? If anything I would assume making sure my application builds against legacy Linux build tools (and yes, even assuming GNU make) is a good thing…
To ask the question another way: what segment of my user base is BSD-based? I suppose you're answering wrt the BSD adoption portion of the parent question. I guess my own comment is that unless one's software application is massively popular, all the genericity considerations of the build tooling you've described sound like massive overkill.
I think if you take this argument one step further, you end up building only for Windows. That’s not a fun world. Would we then just invest a bunch of effort into Wine? It’s what we used to do.
Portability allows for using your favorite platform. It’s something we all have valued from time to time.
If you make the right choices, you can develop very portable software these days in most languages. So, the way I read it, learning how to make those choices is what the OP is suggesting.
I expect it is disallowed in a lot of package managers. For pkgsrc, we would like to keep a local copy of all the files used in a build and run a checksum.
If you want to do this anyway, make sure it's possible to disable it with a tunable, say --dont-fetch-dependencies.
autotools, meson, cmake, etc. are all fine. If a mainstream package uses it, then it’s guaranteed to be easy to use.
If you don’t know of any feature configure checks you need, go for it! You can always add the complexity later if it turns out to be necessary.
Additional good items:
- Standard license.
- Build instructions are great, in a README or a BUILDING file. Ideally, list dependencies.
- Simple "smoke test" to see if the package works. I often interact with packages I don't use and want to know if they still work after my changes.
- Version number always increases. If you don't have a version number yet, 0.0.20191020 is viable.
- Ideally, don't require a specific version of a dependency, but "newer than a cutoff" (something like the check sketched below).
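For that last point, a hedged sketch of what "newer than a cutoff" can look like in a Makefile; the library name (`foo`) and the version are placeholders:

```
# Accept any foo >= 1.2 rather than pinning an exact version; fail early
# with a clear message if the system copy is too old.
check-deps:
	pkg-config --atleast-version=1.2 foo || \
		{ echo "error: foo >= 1.2 is required" >&2; exit 1; }
```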
As a distro maintainer, I can tell you the most important thing of all: the less weird your thing is, the more likely it is to get packaged.
Autotools? I don’t like it, but I know how to handle this. CMake? Yeah, that too. A build system I have never heard of? No. Just no. Never do that.
Plain Makefiles are okay if you stick to common practices, i.e. overriding CC/LD/CFLAGS/LDFLAGS/PREFIX/DESTDIR/etc. on the make command line should all work.
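For what it's worth, here is a rough sketch of such a Makefile (the program and file names are made up); every variable can be overridden from the command line, e.g. `make CC=clang CFLAGS=-O2` or `make DESTDIR=/tmp/stage PREFIX=/usr install`:

```
CC      = cc
CFLAGS  = -O2 -Wall
LDFLAGS =
PREFIX  = /usr/local
BINDIR  = $(PREFIX)/bin

OBJS = main.o util.o

mytool: $(OBJS)
	$(CC) $(LDFLAGS) -o $@ $(OBJS)

# DESTDIR stays empty for a normal install; packagers set it to stage the
# files into a temporary root.
install: mytool
	mkdir -p $(DESTDIR)$(BINDIR)
	install -m 755 mytool $(DESTDIR)$(BINDIR)/mytool

clean:
	rm -f mytool $(OBJS)
```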
As a user of packaging mostly, I’d like authors to include manpages to help packagers integrate their software to their OS.
What would this manpage say? What should the publisher put in that file to tell packagers what to do?
Limit the number of dependencies (and dependencies of dependencies) of your software as much as possible. If you use dependencies, make sure they are well supported, or be willing (and able!) to take over their maintenance when needed…
Check what is already packaged (i.e. language(s), frameworks, dependencies) in your favorite (LTS) OS(es) and use those (versions), e.g. check what CentOS and Debian/Ubuntu are doing and target those specifically. It is not always necessary to target the latest and greatest…
Make it possible to run your included tests with system libraries instead of your (bundled) copies as well; the packager can then make sure everything works when using the already-packaged dependencies and run the tests during the build stage, e.g. `%check` in RPM spec files…

One good way to check what is already packaged is to use Repology.
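One hedged way to allow that from a plain Makefile (the dependency name `foo` and the paths are invented): default to the bundled copy, but keep the flags in variables so a packager can point them at the system library when running the tests.

```
# Defaults build and test against the bundled copy under third_party/.
# A packager can override these on the command line, for example:
#   make check FOO_CFLAGS="$(pkg-config --cflags foo)" FOO_LIBS="$(pkg-config --libs foo)"
FOO_CFLAGS = -Ithird_party/foo/include
FOO_LIBS   = third_party/foo/libfoo.a

test_runner: tests/runner.c
	$(CC) $(CFLAGS) $(FOO_CFLAGS) -o $@ tests/runner.c $(FOO_LIBS)

check: test_runner
	./test_runner
```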
I'd say Debian is one of the most demanding distros for packaging. Judging from your questions, the most jarring requirement: no internet access allowed during the build. All dependencies must come from other Debian packages.
In general, don’t use anything fancy. If the good old configure-make-make-install process works then great. Established older build systems like CMake or Scons should be fine. Bazel is not packaged for Debian yet, so that would be a problem.
Every sane distro will have that requirement. Fetching dependencies during build time makes you dependent on external services and makes the process reliably non-reproducible.
I find this sort of phrasing so weird. So what, Archlinux is an insane distro? Arch build scripts for Rust, for example, will just invoke Cargo, which will download any necessary crates for building.
This has its downsides, but it hardly makes it “insane.” The advantage is that packaging Rust applications (and other applications in other languages with similar packaging tools) becomes much simpler.
I can auto-generate Rust packages without the need to fetch from the internet. We extract it, read the Cargo.lock file to determine the dependencies, create .cargo/config to point to our own vendored copies, and run cargo with the --frozen flag.
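A rough sketch of that flow, as it might appear in a package recipe's fetch/build steps (a sketch only; the exact commands a given distro uses will differ):

```
# Fetch step (run once, with network access): copy every crate listed in
# Cargo.lock into ./vendor and point cargo at the local copies.
vendor:
	mkdir -p .cargo
	cargo vendor vendor > .cargo/config

# Build step (no network): --frozen refuses to touch the network or
# modify Cargo.lock, so only the vendored sources are used.
build:
	cargo build --release --frozen
```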
Sure, right. I imagine Debian does something similarish. My point was that Arch, as far as I know, doesn’t require builds to not talk to the Internet. Rust was just an example.
In my package manager I just use cargo vendor and store a hash of the expected vendor directories; I think it's a good compromise, tbh. It's becoming quite difficult to go against what language package managers expect, but features like cargo vendor make it at least reasonable.
If this is the case, it is a mistake and a trend we should work to reverse.
To help OS maintainers, keep the software as simple as possible, avoid CMake and autotools if possible, write very simple Makefiles.
I totally agree, especially in regard to Makefiles, and am glad to see that you linked one of ours as an example. We only write Makefiles for all suckless tools and stay away from CMake, autohell or other solutions. This simplicity makes it easy to package software (see Gentoo, Alpine and NixOS as examples).
Granted, we keep the scope of our software small, but maybe we should generally question the tendency nowadays to build massive behemoths with tons of dependencies and complex build logic. If you think about it, the reasons why things like autohell exist are not present anymore. 10-20 years ago, the ecosystem was much more diverse, but nowadays, every time I see a configure-script check if my compiler supports trigonometric functions or something, I just shake my head. 99% of configure scripts are copy-pasted from GNU code and they are a huge waste of time and energy. To make matters worse, these configure-scripts effectively prevent me from easily compiling such software on a RaspberryPI (or similar) as it takes so long to run these configure-scripts every time.
In contrast to this, a Makefile takes mere seconds, and if you run a more “nonstandard” system, you just change config.mk (example taken from farbfeld) to your needs, but this is not necessary in 99% of the cases.
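For anyone unfamiliar with the pattern, a config.mk boils down to something like this (a rough sketch in the suckless spirit, not a copy of farbfeld's); the Makefile simply does `include config.mk`, and the packager or user edits only this file:

```
# config.mk: the single place to adjust paths and flags for your system.
VERSION = 0.1

# paths
PREFIX    = /usr/local
MANPREFIX = $(PREFIX)/share/man

# flags
CPPFLAGS = -D_DEFAULT_SOURCE
CFLAGS   = -std=c99 -Os -Wall -Wextra $(CPPFLAGS)
LDFLAGS  = -s

# compiler
CC = cc
```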
To make it short, @xorhash: Keep it simple! :)
This is madness if you intend to support anything other than Linux.
All suckless programs are simple enough that any experienced Unix user should be able to compile them without any Makefile at all. Most can be compiled by just `cc -lX11 -lXft -o dwm *.c`, or something along those lines.

It's probably not a good solution for, say, Firefox or KDE, but not all software is Firefox or KDE.
The makefile I linked is cross-platform.
I disagree, the only platform that can’t deal with a simple makefile in my experience is Windows.
Why is this a mistake?
Most language package managers are poorly designed, poorly implemented and modify the system in ways that are not deterministic, not reproducible and cannot be reversed. I'm thinking of pip in particular, but I believe npm suffers from similar issues as well. If we want something in between OS package managers and language package managers then probably something like Nix is required.
I don't think it's a mistake. Only 1% of open source components are packaged on a distro like Debian. The ecosystem with the best packaging coverage is CPAN, and that hovers between 10 and 15%. Distros aren't in that business anymore. What OS package managers should do is deliver native packages and be good stewards of the rest of the software in the system. For example, making apt "npm aware" so that apt can still record a package-install operation even if it was triggered by npm locally.
It doesn’t take much for simple Makefiles to not scale. Try adding a dependency on something as basic as iconv or curses while remaining portable. Autoconf is not as bad as all that and is common enough that any OS packager will know how to deal with it. I’m rather less fond of libtool, though.
I maintain a project that has iconv as a dependency, the Makefile supports Linux and the BSDs, trivially.
edit: disambiguate
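One hedged way this is commonly handled (not necessarily what the parent's project does): keep the iconv flags in overridable variables, since glibc ships iconv in libc while some systems need GNU libiconv installed separately.

```
# Empty by default (iconv lives in libc on glibc systems). Packagers on
# systems that ship libiconv separately can run, for example:
#   make ICONV_CFLAGS=-I/usr/local/include ICONV_LIBS="-L/usr/local/lib -liconv"
ICONV_CFLAGS =
ICONV_LIBS   =

myprog: main.o
	$(CC) $(LDFLAGS) -o $@ main.o $(ICONV_LIBS)

main.o: main.c
	$(CC) $(CFLAGS) $(ICONV_CFLAGS) -c main.c
```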
For those of us writing Rust, is a plain Cargo package okay?
Yes! For FreeBSD, at least: we can package these very, very easily; the Ports framework has a command for adding all the crates from the Cargo lockfile :)
Though if you want to go beyond just an executable and ship a whole desktop app with resources and stuff, check out what Fractal does with Meson and `cargo vendor`.

That's awesome!
I just released a potentially useful tool at https://git.nora.codes/nora/utf8-norm ; where should I go to get it looked at for inclusion?
Also, thanks for the info re: desktop apps. I’ll look into that for gDiceRoller.
Someone who’s interested in writing and maintaining the port needs to post it to bugzilla. As for how you get someone interested if you’re not going to be the maintainer yourself… I’m not aware of a formal place for “someone please package this” requests. Usually maintainers find interesting stuff themselves (by browsing this website or the orange one, for example).
Totally makes sense. If I ever manage to actually get a daily driver FreeBSD system going, I’ll definitely submit and maintain some stuff. Until then I don’t feel super good about committing to maintain ports I can’t test in daily use.
As a user of software, if distro packages aren’t available, I’m happy if I can find a docker image or flatpak.
We always want to build from source. Posting checksums of your tarballs is helpful too. Do NOT rely on GitHub's automatic tagged-release tarballs. Those tarballs are generated on demand and can change.
The build process must not require internet access to build your software from source. All deps should be easily acquired before build time. This is a security issue and a reproducibility issue.
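On the author's side, a small sketch of what that can look like (the project name and version are placeholders): cut the release tarball yourself from the tag, upload that fixed artifact, and publish a checksum next to it instead of pointing people at an autogenerated archive.

```
NAME    = mytool
VERSION = 1.2.3

# Build the release tarball from the tagged commit and write a checksum
# file that packagers can verify against. (Use sha256 on the BSDs.)
dist:
	git archive --format=tar.gz --prefix=$(NAME)-$(VERSION)/ \
		-o $(NAME)-$(VERSION).tar.gz v$(VERSION)
	sha256sum $(NAME)-$(VERSION).tar.gz > $(NAME)-$(VERSION).tar.gz.sha256
```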
Project lead and principal packager for Adélie Linux here. I would like to start by saying: thank you for actually considering us (software packagers)! It’s a thankless job, but we’re on the front lines, porting your software to our environments (which may include CPU architectures, endians, libcs, etc that you’ve never even considered or heard of). So, I’d start with:
Understand that packagers are people too, and a lot of us are also programmers. For the vast majority of us, when we report bugs or issues, we are trying to help you and we care enough about your software that we are taking time out of our list of 500 updates to try and make it easier for the next people. Don’t blindly dismiss us just because we’re trying to do distribution packaging.
That's not the hard way. The `curl | bash` way is the hard way. `curl | bash` already implies the user has bash (our base platform image uses zsh as the login shell and dash as /bin/sh). That method either requires you to assume software is present on the system (it may not be), or it means you will build it all yourself. Consider the LLVM argument made in another comment: we had to patch LLVM to make it work on the musl libc on POWER. If we then build your software and it tries to build "raw" LLVM, we're going to have to patch your LLVM copy too.

Rust is hard to package because of the way Cargo works. Rust itself is a good language and I hope some day that situation will improve. Currently it's not even possible to use Cargo for most things without patching.
Also, NPM is a headache.
Somewhere in the middle of "what is supported right now". In Adélie, we have Python 3.6, Ruby 2.5. If you really need something from the latest version, we're probably just not going to ship your software until we have the new language version.
Oh, I really don't want to name and shame. Most packages are at least somewhat sane and can be coerced into building properly even on our environments. The negative examples I could think of would be Google products – libvpx is a horror show, and Chromium… well, they're not even packageable.
The most useful thing is having a branch of your last stable version and applying security fixes to that branch as well. Not everyone has the time or patience to do that, so at least:
It doesn’t have to be. CMake works well. Meson is a bit annoying at times when people use it improperly, but can also work well.
Of course! Use the simplest tool that works for you. That helps not just us, but your own maintenance (less to worry about).
scons is a bit of a pain, but most of the others are fine.
Any examples of improper use? I really can’t think of a Meson-specific way to misuse the build system, Meson is designed to avoid that as much as possible :)
GLib doesn't set 'install: true' on their static libraries, causing link failures in downstream apps that require GLib's charset stuff, because they have no idea how to conditionalise it based on whether the target is glibc or not.