Welcome @stapelberg to lobste.rs! :-)
Happy you finally released it publicly.
There is also a German talk about package managers and distri. I submitted it two months ago (only accessible if you are logged in), but asked for it to be taken down again.
I’ve just tried out the Docker image and this whole project looks really promising (though I’d have called it “distri add” and not “install” ^^).

New innovation in Linux package management is super needed. NixOS is cool, but what distri is addressing goes even further. I think “Can we do without hooks and triggers?” is a really important question to ask; this is a huge issue that (for example) the Debian mindset still has: everything needs to somehow be glued together in various ways…
Do I understand right that distri mounts SquashFS via FUSE? Are there any security issues or missing guarantees in comparison to regular kernel-space file systems? My FUSE security know-how is limited…
I’ve read that Linux namespaces are getting FUSE support; will this mean we could create distri-based Docker images?
Thanks!
I want to do an English talk at some point, too, and will definitely share the recording.
Correct, distri mounts SquashFS via FUSE!
If anything, I would say there is less attack surface when running the SquashFS driver in user space: if a malicious image is used (e.g. from a third-party mirror that an attacker convinced you to use), at worst you’ll need to reboot your system when the FUSE daemon crashes (we should look into restarting it when it crashes, but it has crashed rarely thus far).
Interesting! Can you share a link please? I haven’t heard of this yet.
Seconded, welcome aboard! I hope you stick around.
Thanks! Thus far, I like the discussion; it seems very positive.
I found the GitHub issue again: https://github.com/docker/for-linux/issues/321#issuecomment-487955090
Seems it is not in there, but I guess a big foundation is “FUSE Gets User NS Support Linux 4.18”.
Well I guess this is an entirely different story for read-only SquashFS images.
I can’t find it anywhere; I think it was about not having certain guarantees in FUSE (like multiple users on one file system). But I guess this exact use case doesn’t run into many of the issues people usually have with FUSE (higher latency?). I’m also wondering how it will behave in low-memory conditions.
PS: This paper looks really interesting
http://edoc.sub.uni-hamburg.de/informatik/volltexte/2015/210/pdf/bac_duwe.pdf
Ah, yeah. distri currently uses unprivileged user namespaces (which need to be explicitly enabled on a number of distributions), so it already has permission to mount FUSE within the namespace. I don’t think we’ll gain anything from that change you reference.
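For illustration, this mechanism can be reproduced standalone; a minimal sketch, assuming util-linux’s unshare, squashfuse, and a kernel >= 4.18 (the image path is made up):

    # Create a user + mount namespace without root, then perform a FUSE
    # mount inside it. pkg.squashfs and /tmp/mnt are purely illustrative.
    unshare --user --map-root-user --mount sh -c \
      'mkdir -p /tmp/mnt && squashfuse pkg.squashfs /tmp/mnt && ls /tmp/mnt'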
We’re using FUSE’s allow_other option so that all users can read from the file system; I think that’s what you mean.

Yeah, thanks for the link! There’s also https://www.usenix.org/system/files/atc19-bijlani.pdf (I wrote about it in https://twitter.com/zekjur/status/1149582433072771078).
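As a standalone illustration of that option (using squashfuse rather than distri’s own FUSE daemon; note that allow_other additionally requires user_allow_other in /etc/fuse.conf when mounting as a regular user):

    # Mount a package image so that users other than the one who mounted
    # it can read from it.
    squashfuse -o allow_other pkg.squashfs /ro/pkg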
This is quite similar in mechanism to what Nix uses - immutable paths, path patching, storing at well-known places via symlinks. Where would the main differences be?
There is indeed some overlap with Nix et al. (the key idea of separate hierarchies, and using a package store).
I have used NixOS for about half a year, and the user experience felt distinctly different from what is possible to achieve in distri (which I have been using on my laptop for a few months now).
Notable differences are package installation speed, how packages are built (declaratively specified, not in a functional language), and how well it works with third-party software.
Does that answer your question?
Nice, thanks for such a prompt answer! The installation speed sounds pretty interesting! Did you do any scalability measurements of how many images could be supported under /ro? (Just curious whether the approach could help Nix, which has a lot under /nix/store.)
I currently have 677 packages in my store on my laptop. There are only 425 different packages in distri, so there are some duplicates which I have not yet deleted.
The only slow-down I have found thus far is when the exchange directories are traversed at boot. This could be done lazily and/or cached; I just haven’t gotten around to it. I wanted to do the release first, and implementation details can always be polished later :)
Just to make sure I’m understanding you correctly, are you saying that Distri is sort of an unopinionated and more compromising version of NixOS?
Also, just curious: aren’t you trading off package installation efficiency for efficiency during operation of the package? I have to imagine that it’s more expensive to access data through a FUSE program parsing squashfs images at runtime than natively through the file system.
Absolutely, that’s a good observation!
To productionize this idea, it would probably make sense to implement this file system as a kernel module, not via FUSE.
That said, in my day-to-day use, programs are typically loaded into memory once, and I don’t notice that being slower than on the non-distri computers I use. So I don’t feel any urgency to rush from user space to kernel space :)
While that is a component of my premise, user/kernel-space slowdown isn’t primarily what I was referring to. From a high-level perspective, it would seem to me that traversing squashfs images isn’t as efficient as, for example, the ext4 on-disk format.
If you haven’t noticed a slowdown, maybe it’s not a cause for concern, though I could see competitive file-system performance becoming a bigger issue that may block adoption when cold-booting larger programs like Firefox / LibreOffice, etc.
Why is that? I haven’t looked into ext4’s on-disk format as much as I have into SquashFS’s, but it seems pretty efficient to me.
I’m using google-chrome regularly. Reading the few files it contains is not a big deal, even through my FUSE file system :)
You would know more, but I see two situations:
1. The squashfs on-disk format is meant to be mapped directly into memory and traversed directly. In that case, I would imagine there is a tradeoff between the size of the squashfs image file and the efficiency of traversing it once it is in memory; e.g., data redundancy usually allows faster in-memory data structures but is not good for image file size.

2. The squashfs on-disk format requires some pre-processing before it can be loaded into memory and traversed. This is different from ext4, which basically requires no pre-processing before being traversed.

In either situation, squashfs is either not efficient disk-space-wise or not efficient performance-wise (relative to e.g. ext4). Of course, it’s possible that squashfs is less efficient than ext4 but close enough that it doesn’t matter. As long as cold-boot performance is comparable (e.g. not more than 1-5% slower), it probably will never bubble up as an issue.
SquashFS is actually pretty flexible: by default, it optimizes for size, but my implementation optimizes for easy access. There are still some wins to be had, like producing SquashFS’s directory index data structures.
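To make that size-vs-access tradeoff concrete, squashfs-tools exposes it directly (a sketch using mksquashfs, not distri’s own image writer):

    # Default behavior: optimize for size (metadata and data compressed).
    mksquashfs ./pkg pkg.squashfs

    # Optimize for cheap access instead: leave inodes, data blocks,
    # fragments and xattrs uncompressed, so reads skip decompression.
    mksquashfs ./pkg pkg-fast.squashfs -noI -noD -noF -noX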
Gotcha, thanks for entertaining my reasoning :)
Just curious, any reason you didn’t choose to use a combination of overlayfs and squashfs, e.g. mounting squashfs images natively, and using overlayfs to create the exchange directories?
I actually implemented it like that before implementing my FUSE file system! It turns out that adding new kernel mounts gets really slow really quickly. One of my kernel-developer friends told me that mounts are kept in a linked list, which degrades pretty quickly.
Setting up all the mount points required to build moderately sized programs took many seconds. With the FUSE file system, which can lazily mount these images (and mounts them more quickly), this setup is now done in a fraction of a second, which is a massive developer-experience improvement :)
Additionally, overlayfs mounts were pretty complicated to manage correctly programmatically, especially when changing the composition of overlays, like when installing a new package.
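For reference, a minimal sketch of what that earlier approach looked like (package names and paths are illustrative, not distri’s real layout):

    # One read-only kernel mount per package image...
    mount -t squashfs -o loop,ro gcc-amd64-8.2.0-3.squashfs /ro/gcc-amd64-8.2.0-3
    mount -t squashfs -o loop,ro binutils-amd64-2.31-3.squashfs /ro/binutils-amd64-2.31-3

    # ...plus a read-only overlay per exchange directory (multiple lowerdir
    # layers; no upperdir/workdir is needed for a read-only overlay).
    mount -t overlay overlay \
      -o lowerdir=/ro/gcc-amd64-8.2.0-3/bin:/ro/binutils-amd64-2.31-3/bin \
      /ro/bin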
This reminds me a little of Haiku’s packagefs system.
More than a little for me; I’m having trouble finding a feature that isn’t implemented by Haiku’s package manager. Not that that’s a bad thing; I’ve been wanting an equivalent for Linux ever since I learned about it.
Thank you for sharing, stapelberg.
Something I have not seen mentioned elsewhere: GoboLinux. It doesn’t treat packages as mounted images, but it does have a philosophy of “program goes in one folder” and then “farm out symlinks into traditional paths”. I think they also have a kernel module and tool to hide paths (like /usr) from showing up in an ls of /.
https://gobolinux.org/at_a_glance.html
Sidenote: I’ve read a lot of comments shouting ‘NIH’ and ‘suspiciously similar’ on other sites. Ignore them, do what you want to do.
Thanks for the pointer! A few people have mentioned GoboLinux on Twitter, too. I had read about it many years ago, and there are certainly some similarities. I like it when projects reinforce each other like that.
Thanks also for your note of support. I appreciate it!
Ever since I discovered Nix, I expected that it (or something like it) was the future of Linux packaging.
Now I’ve got two interesting up-and-coming packaging ecosystems to keep an eye on. :)
Yes, interesting observation! I thought everything would end up with apt, but then Nix, and now distri, came along. Sometimes I wish for something like this on FreeBSD.
This heading made me realize that’s what I use Docker for…
You mention that when there are conflicts, the package with the highest distri revision will take precedence, but are there plans to allow globally selecting a particular package version as the ‘default’? E.g., some distros provide tools that basically symlink /usr/bin/gcc-7 -> /usr/bin/gcc to set the default gcc to 7 even if gcc-8 were installed.
As an aside, I had no idea you wrote i3. Wow, thank you for that!!
That can be achieved using e.g. a symlink in a directory that is on your $PATH, along these lines (the exact versioned package path under /ro is illustrative):
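    # Hypothetical sketch: designate a directory early in $PATH to hold and
    # manage the symlinks; the versioned path under /ro is illustrative.
    mkdir -p ~/bin
    ln -sf /ro/gcc-amd64-7.4.0-3/bin/gcc ~/bin/gcc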
It’s a bit low-level, but you typically only need that temporarily while working on a project that doesn’t build with your preferred compiler version.
For distri packages themselves, you would just only depend on one version or the other, and not have a conflict there.
Thanks, glad you like it!
Or just:

    PATH=/ro/gcc-amd64-8*/bin:$PATH make
Or is there a reason why you’d want gcc-8 and g++-7?
Sure, that’s even shorter. What I was hinting at with my approach is that you can designate a directory to hold and manage symlinks, which might be easier than overriding $PATH ad hoc when you need to.

No, no specific reason other than I didn’t pay attention :)
Ok. Btw, this looks really great. I’m convinced packaging/modules isn’t even close to a solved problem, so I’m really excited to see exploration in this space.
Thanks! I appreciate the kind words.
Really interesting project. I would jump from Debian if/when this project is ready for daily use. The ability to install multiple versions of a package is crucial for experimentation and local package development. Atomic updates are another feature that would give me less anxiety before upgrading.
One question: how is this different/better than NixOS? Answered elsewhere, thanks!

[Comment from banned user removed]
What would you prefer over Go?
[Comment from banned user removed]
Why is Go a blocker? It’s not my favorite language, but it is a safe language, produces static binaries, and is fast enough.
[Comment from banned user removed]
I would classify it as a conservative language, not an anti-intellectual language.
[Comment from banned user removed]
This thread is a perfect example of people downvoting because they dislike something someone said. Shameful.
Rust has a terrible bootstrap-from-source story if you care about minimizing the binary blobs you have to download from the internet.
Package managers care a lot about that sort of thing, and Go has a pretty good bootstrap story.
Go is a slight turn-off to me as well, but that objectively probably doesn’t mean much, since I would prefer it if it were written in C, mostly for consistency with the rest of Linux infrastructure.
I disagree — while we have to retain C for backwards compatibility (probably for decades to come, maybe even a century), it’s well past time to use a language with fewer footguns. I’d prefer Lisp (of course), you prefer Rust, but honestly Go is a decent language in which one can get work done without excessive boilerplate and pain.
IMHO Go is probably the ideal language for tooling like this in the real world (as opposed to my ideal world, in which everything is a Lisp machine).
With what do you disagree? That’s my preference, I’ve made no claim. Write your software in Go or Lisp or whatever you want, I just won’t use it.