I’ve said it before and I’ll say it again: ZFS should be the default on all Linux distros. It’s in a league of its own, and makes all other existing Linux filesystems irrelevant, bizarre licensing issues be damned.
I use ZFS and love it. But I disagree that ZFS should be the default as-is. It requires a fair bit of tuning; for non-server workloads, the ARC in particular. ZFS does not use Linux’ buffer cache, and while the ARC size adapts, I have often seen on lower-memory machines that the ARC takes too much memory at a given point, leaving too little for the OS and applications. So most users would want to tune zfs_arc_max for their particular workload.
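For anyone wondering what that tuning looks like in practice, here is a minimal Python sketch that picks an ARC cap and formats it as a kernel module option. On Linux, OpenZFS takes `zfs_arc_max` as a byte count, usually set in a file under `/etc/modprobe.d/`. The 25% fraction, the 512 MiB floor, and the helper names are assumptions for illustration only, not recommended values.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def arc_max_bytes(total_ram_bytes, fraction=0.25, floor_bytes=512 * MIB):
    """Return a candidate ARC cap in bytes.

    Hypothetical policy: a fixed fraction of RAM, never below a floor,
    and never above the total RAM itself.
    """
    if total_ram_bytes <= 0:
        raise ValueError("total_ram_bytes must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    wanted = max(int(total_ram_bytes * fraction), floor_bytes)
    return min(wanted, total_ram_bytes)


def modprobe_line(limit_bytes):
    """Format the cap as a module option line for the zfs kernel module."""
    return f"options zfs zfs_arc_max={limit_bytes}"


# Example: a 16 GiB desktop capped at a quarter of RAM (assumed policy).
print(modprobe_line(arc_max_bytes(16 * GIB)))
```

The right fraction depends entirely on the workload, which is the point being made above: a file server wants a large ARC, a laptop running a browser and a compiler probably does not.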
I do think ZFS should be available as an option in all Linux distributions. It is simply better than the filesystems that are currently provided in the kernel. (Maybe bcachefs will be a competent alternative in the future.)
I remember installing FreeBSD 11 once (with root on ZFS) because I needed a machine remotely accessible via SSH to handle files on an existing disk with ZFS.
No shizzle, FreeBSD defaults, the machine had 16G of RAM, and during an hours-long scp run, ARC decided to eat up all the memory, triggering the kernel into killing processes… including SSH.
So I lost access, had to restart scp again (no resume, remember), etc. This is a huge show stopper and it should never happen.
That seems like a bug that should be fixed. Don’t see any reason why that should prevent it from being the default though.
That’s definitely something to consider, however, Apple has made APFS (ZFS inspired) the default on macOS, so there’s got to be a way to make it work for ZFS + Linux Desktop too. ZFS is all about making things work without you having to give it much thought. Desktop distros can pick reasonable defaults for desktop use, and ZFS could possibly make the parameter smarter somehow.
I think the licensing issue is the primary problem for Linux distros.
I agree on technical superiority. What about the Oracle threat given its owner pulled off that API trick? Should we take the risk of all owing Oracle’s lawyers money in some future case? Or rush to implement something different that they don’t control with most of its strengths? I think the latter makes the most sense in the long-term.
Oracle is not a problem, as the ZFS license is not being violated – it is the Linux license.
“Oracle is not a problem, as the ZFS license is not being violated”
That’s a big claim to make in the event large sums of money are ever involved. Oracle threw massive amounts of lawyers at Google, with the result being that APIs were suddenly a thing they could copyright. Nobody knew that before. With enough money and malicious intent, it became a thing that could affect FOSS developers or anyone building on proprietary platforms. What will they do next?
I don’t know. Given they’re malicious, the safest thing is to not use anything they own or might have patents on. Just stay as far away from every sue-happy party in patent and copyright spaces. Oracle is a big one that seeks big damages for its targets on top of trying to rewrite the law in cases. I steer clear of their stuff. We don’t even need it, either. It’s just more convenient than alternatives.
The CDDL, an OSI-approved open source license, includes both a copyright and patent grant for all of the code released by Sun (now Oracle). Oracle have sued a lot of people for a lot of things, but they haven’t come after illumos or OpenZFS and there are definitely companies using both of those bodies of software to make real money.
I think you’re missing the implications of how they effectively rewrote the law in the case I referenced. If they can do that, it might not matter what their agreements say if it’s their property. The risk might be low enough that it never plays out. One just can’t ever know if they depend on legal provisions with a malicious party that tries to rewrite laws in its favor with lobbyists and lawyers.
And sometimes succeeds unlike basically everyone doing open source and free software. Those seem to barely enforce their agreements and/or be vulnerable to patent suits in case of the permissive licenses. Plus, could the defenders even afford a trial at the current rates?
I bet 10 years ago you wouldn’t have guessed a mobile supplier using an open-ish platform would be fighting to avoid handing over $8 billion to an enterprise-focused database company. Yet, untrustworthy dependencies let that happen. And we got lucky it was a rich company that depended on OSS/FOSS stuff defending. The rulings could’ve been worse for us if it wasn’t Google.
Seeing as Sun gave ZFS away before Oracle bought it, Oracle would need a LOT of legal wackiness to get the CDDL license revoked somehow. But for the sake of argument, let’s assume they do somehow manage to get it invalidated, go nuts, and try to make everyone currently using ZFS pay bajillions of dollars for “their” tech. Laws would have to change significantly for that to happen, and with such a significant change in current law, there is basically zero chance it would be retroactive from the moment you started using ZFS, so worst case you’d have to pay from the time of the law change. That is, if you didn’t just move off of ZFS after the law changed and be out zero dollars.
Also, the OSS version of ZFS is now so different from Oracle’s version that they are sort of kissing cousins at best. ZFS has been CDDL licensed since 2005, so there is a long history of divergence from the Oracle version. I think Oracle would have a VERY hard time getting the OSS version back under the Oracle banner(s). Even with very hypothetical significant law changes.
I’m in favour of things competing against ZFS, but currently nothing really does. BTRFS tries, but its stability record is pretty miserable for anything besides the simplest workloads. ZFS has been in development since 2001 and in wide production usage since the mid-2000s. Maybe in another 5 or 10 years we will have a decent stable competitor to some of ZFS’s feature-sets.
But regardless, if you are a large company with something to lose, your lawyers will be the ones advising you about using ZFS or not, and Canonical’s lawyers clearly decided there was nothing to worry about, along with Samsung (who own Joyent, the people behind Illumos). There are also many other large companies that have bet big on Oracle having basically zero legal leg to stand on.
Of course the other side of the coin is the ZFS <-> Linux marriage, but that’s easy: just don’t run ZFS under Linux, or use the Canonical-shipped version and let Canonical take all the legal heat.
Best counterpoints so far. I’ll note this part might not be as strong as you think:
“and Canonical’s lawyers clearly decided there was nothing to worry about, along with Samsung (who own Joyent, the people behind Illumos)”
The main way companies dodge suits is to have tons of money and patents themselves to make the process expensive as hell for anyone that tries. Linux companies almost got patent sued by Microsoft. IBM, a huge patent holder, stepped up saying they’d deal with anyone that threatened it. They claimed they were putting a billion dollars into Linux. Microsoft backed off. That GPL companies aren’t getting sued made Canonical’s lawyers comfortable, but it is not an actual assurance. Samsung is another giant patent holder with big lawyers. It takes an Apple-sized company to want to sue them.
So, big patent holders or projects they protect are outliers. That might work to ZFS’s advantage here. Especially if IBM used it. They don’t prove what will happen with smaller companies, though.
I agree with you in theory, but not in practice because of the CDDL (which ZFS is licensed under). This license explicitly grants a “patent peace” see: https://en.wikipedia.org/wiki/Common_Development_and_Distribution_License
I know most/many OSS licenses sort of wimp out on patents and ignore the problem, CDDL doesn’t. Perhaps it could have even stronger language, and there might be some wiggle room for some crazy lawyering. I just don’t really see Oracle being THAT crazy. Oracle, being solely focused on $$$$, would have to see some serious money bags to go shake loose. I doubt they would ever bother going after anyone not the size of a Fortune 500; the money just isn’t there. Google has giant bags full of money they don’t even know what to do with, so Oracle trying to steal a few makes sense. :P
Oracle going after Google makes sense knowing Oracle, and it was, like you said, brand new lawyering, trying to create copyrights out of APIs. Patents are not remotely new. So some lawyer for Oracle would have to dream up some new way to screw up laws to their advantage. Possible sure, but it would be possible for any other crazy lawyer to go nuts here (wholly unrelated to ZFS or even technology), it’s not an Oracle exclusive idiocy. Trying to avoid unknown lawyering that’s not even theoretical at this point would be sort of stupid I would think… but I’m not a lawyer.
“I know most/many OSS licenses sort of wimp out on patents and ignore the problem, CDDL doesn’t.”
That would be reassuring on the patent part.
“Possible sure, but it would be possible for any other crazy lawyer to go nuts here (wholly unrelated to ZFS or even technology), it’s not an Oracle exclusive idiocy. Trying to avoid unknown lawyering”
Oracle was the only one to flip software copyright on its head like this. So, I don’t think it’s an any company thing. Either way, the threat I’m defending against isn’t unknown lawyering in general: it’s unknown lawyering of a malicious company whose private property I may or may not depend on. When you frame it that way, one might wonder why anyone would depend on a malicious company at all. Avoiding that is a good pattern in general. Then, the license negates some amount of that potential malice for a great product with unknown, residual risk.
I agree the residual risk probably won’t affect individuals, though. An Oracle-driven risk might affect small to mid-sized businesses depending on how it plays out. Good news is swapping filesystems isn’t very hard on Linux and the BSDs. ;)
AFAIK, it’s the GPL that’s being violated. But I’m really tired and the SFC does mention something about Oracle suing so 🤷.
Suing based on the use of works derived from Oracle’s CDDL sources would be a step further than the dumb Google Java lawsuit because they haven’t gone after anyone for using OpenJDK-based derivatives of Java. Oracle’s lawsuit-happy nature would, however, mean that a reimplementation of ZFS would be a bigger target because it doesn’t have the CDDL patent grant. Of course, any file system that implements one of their dumb patents could be at risk….
I miss Sun!
What does ZFS have that is so much better than btrfs?
I’m also not sure these types of filesystems are well suited for databases which implement their own transactions and COW, so I’m not sure I would go as far as saying they are all irrelevant.
ZFS is extremely stable and battle-tested. While that’s not a reason in itself to call it a better filesystem, it makes it an extremely safe option when what you’re looking for is something stable to keep your data consistent.
It is also one of the most cross-platform file systems: Linux, FreeBSD, macOS, Windows, and illumos. It has a huge amount of development behind it, and as of recently the community has come together significantly across the platforms. Being able to export your pool on FreeBSD and import it on Linux or another platform makes it a much better option if you want to avoid lock-in.
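To make the export/import point concrete, here is a small Python sketch that builds the two commands involved. `zpool export` and `zpool import` are the real subcommands; the pool name `tank` and the helper name are assumptions for illustration. The sketch only prints the commands rather than running them, and whether the import succeeds in practice depends on the target platform supporting the feature flags enabled on the pool.

```python
import shlex


def migration_commands(pool):
    """Return (export_cmd, import_cmd) argv lists for moving a pool between hosts."""
    if not pool or pool.startswith("-"):
        raise ValueError("pool name must be non-empty and must not look like an option")
    export_cmd = ["zpool", "export", pool]  # run on the old host, e.g. FreeBSD
    import_cmd = ["zpool", "import", pool]  # run on the new host, e.g. Linux
    return export_cmd, import_cmd


# "tank" is a placeholder pool name.
export_cmd, import_cmd = migration_commands("tank")
print(shlex.join(export_cmd))
print(shlex.join(import_cmd))
```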
Additionally, the ARC
Problems with btrfs that make it not ready:
If I don’t use/want to use RAID5 then I don’t see the problem with btrfs.
I ran btrfs in production on my home server for ~3-4 years, IIRC. If you want to use btrfs as a better ext4, e.g. just for the compression and checksumming and maybe, maybe snapshotting, then you’re probably fine. If you want to do anything beyond that, I would not trust it with your data. Or at the very least, I wouldn’t trust it with your data that’s not backed up using something that has nothing to do with btrfs (i.e. is not btrfs snapshots and is not btrfs send/receive).
I had three distinct crashes/data corruption problems that damaged the filesystem badly enough that I had to back up and run mkfs.btrfs again. These were mostly caused by interruptions/power failures while I was making changes to the fs, for example removing a device or rebalancing or something. Honestly I’ve forgotten the exact details now, otherwise I’d say something less vague. But the bottom line is that it simply lacks polish. And mind you, this is from the filesystem that is supposed to be explicitly designed to resist this kind of corruption. I know at least the last case of corruption I had (which finally made me move to ZFS) was obviously preventable but that failure handling hadn’t been written yet and so the fs got into a state that the kernel didn’t know how to handle.
Well, I don’t know about better, but ZFS has the distinct disadvantage of being an out-of-tree filesystem, so it can and will break depending completely on the whims of kernel development. How anyone can call this stable and safe for production use is beyond me.
I think the biggest argument is mature implementations used by large numbers of people. That catches lots of common and uncommon problems. In reliability-focused filesystems, that the reliability is field-proven then constantly maintained is more important to me than about anything. The only reason I don’t use it is that it came from Oracle with all the legal unknowns that can bring down the line.
When you say “Oracle”, are you referring to ZFS or btrfs? ;)
Oh shit! I didn’t know they designed both! Glad I wasn’t using btrfs either. Thanks for the tip haha.
On a practical level, ZFS is a lot more tested (in Solaris/Illumos, FreeBSD, and now Linux); more different people have put more terabytes of data in and out of ZFS than they seem to have for btrfs. This matters because we seem to be unable to build filesystems that don’t run into corner cases sooner or later, so the more time and data a filesystem has handled, the more corner cases have been turned up and fixed.
On a theoretical level, my personal view is that ZFS picked a better internal structure for how its storage is organized and managed than btrfs did (unless btrfs drastically changed things since I last looked several years ago). To put it simply, ZFS is a volume manager first and then a filesystem manager second (on top of the volumes), while btrfs is (or was) the other way around (you manage filesystems and volumes are a magical side effect). ZFS’s model does more (obvious) violence to Linux IO layering than I think btrfs’s does, but I strongly believe it is the better one and gives you cleaner end results.
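A toy Python model may help picture the "volume manager first" structure: devices are combined into a pool, and filesystems (datasets) are cheap objects that all draw from the pool's shared free space. This is only a sketch of the idea under simplifying assumptions; the class and method names are hypothetical, and it ignores redundancy, metadata overhead, quotas, and reservations.

```python
class Pool:
    """Toy model: a pool aggregates devices; datasets share its free space."""

    def __init__(self, name, device_sizes):
        self.name = name
        self.capacity = sum(device_sizes)  # no redundancy modelled
        self.datasets = {}                 # dataset name -> bytes used

    def create_dataset(self, name):
        # Creating a filesystem allocates nothing up front.
        self.datasets[name] = 0

    def free(self):
        return self.capacity - sum(self.datasets.values())

    def write(self, dataset, nbytes):
        # Every dataset competes for the same pool-wide free space.
        if nbytes > self.free():
            raise OSError("pool out of space")
        self.datasets[dataset] += nbytes


pool = Pool("tank", [100, 100])
pool.create_dataset("home")
pool.create_dataset("var")
pool.write("home", 150)
print(pool.free())
```

In this model there is no per-filesystem partition to resize: space freed by one dataset is immediately available to every other dataset in the pool.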
Why would I want to run ZFS on my laptop?
Why wouldn’t you want to run it on your laptop?