1. 24
  1.  

  2. 21

    We can see that very practically using standard linux / programming tools. While you can use cat to show the data of a file, it won’t work for directories

    That’s a relatively recent addition. Early versions of UNIX exposed directories just like any other file. Before the APIs were added for traversing directories in a filesystem-independent way, userspace would just open directories and parse them directly. Even on modern systems, the ‘data can be stored only in leaf nodes’ part is very dependent on the filesystem and the definition of data. In NTFS and HFS+, I think you can store alternate data streams on directories and on most *NIX systems you can set extended attributes on directories.

    If you’re thinking of files and folders using their actual analog equivalents

    This is why I like to differentiate between a directory and a folder. A directory is (like a telephone directory) a key-value store that indexes from some human-friendly names to something a computer uses. On the original UNIX filesystems, it was exactly that: a file that used a fixed-sized record structure to map from names to inode numbers.

    In contrast, a folder is a UI abstraction that represents a container of documents.

    The fact that folders are typically implemented with a 1:1 mapping to directories is an implementation detail. In BFS, for example, both folders and saved searches were implemented as directories.

    The fact that inner nodes in a file system hierarchy cannot hold data is limiting. In many use cases, it’d be natural to have nodes that can hold data AND have children. Because file systems don’t allow this

    The problem is not that filesystems don’t allow this, it’s that there are no portable interfaces to filesystems that allow with. People want to be able to deploy their persistent data structures onto ext4, ZFS, UFS2, APFS, HFS+, NTFS, and sometimes even FAT32 filesystems and access it with SMB, WebDAV, or NFS (or AFS or AFP) network shares. This means that the set of things that you can use from any given filesystem is the intersection of the features provided by all of these things.

    I was expecting from the title for the author to realise that filesystems are not trees, but apparently this didn’t happen. Filesystems are DAGs if you’re lucky, graphs if you’re not. Hard links make a filesystem a DAG because the same file can be in multiple leaf nodes. Hard links on HFS+ are allowed (if you’re a sufficiently privileged user) to point to directories, which allows them to be an arbitrary graph (the ‘sufficiently privileged’ part is because a lot of things break if they are and so this privilege is granted only to things like the Time Machine daemon that promise not to create cycles). Junctions / reparse-points on NTFS can also create cycles, as can symlinks (though in both cases, at least you know that you’re potentially entering a cycle).

    1. 3

      We can see that very practically using standard linux / programming tools. While you can use cat to show the data of a file, it won’t work for directories

      That’s a relatively recent addition. Early versions of UNIX exposed directories just like any other file. Before the APIs were added for traversing directories in a filesystem-independent way, userspace would just open directories and parse them directly. Even on modern systems, the ‘data can be stored only in leaf nodes’ part is very dependent on the filesystem and the definition of data. In NTFS and HFS+, I think you can store alternate data streams on directories and on most *NIX systems you can set extended attributes on directories.

      IIRC it still worked on FreeBSD versions as recently as 11. In most traditional Unix filesystems (at least I believe it’s the case for [UF]FS and ext{,2,3,4}) a directory is stored much the same way as a file, the only difference is a flag that indicates the blocks pointed to by the inode are used to store directory entries rather than file content. The Rust equivalent is probably closer to:

      struct DirEnt {
          name: String,
          node: Rc<Inode>
      }
      
      enum Inode {
          File: Vec<u8>,
          Dir: Vec<DirEnt>
      }
      

      Note the Rc as well, because an inode might be pointed to by more than one directory entry.

      1. 1

        Can confirm this worked in recentish FreeBSDs. Made some nice file read vulnerabilities funnier to exploit :)

        1. 1

          Modern Unix filesystems use more complicated on-disk structures for at least large directories, because they want to be able to get a specific directory entry (or determine that it doesn’t exist) without potentially having to scan the entire directory. The increased complication of the actual structure of directories and directory entries is one reason why Linux moved to forbidding reading directories as regular files.

          (Some directories haven’t been readable as regular files for a long time. One example is directories on NFS mounts; from the beginning of NFS in the 80s, you could only look at them with NFS directory operations.)

          1. 1

            The increased complication of the actual structure of directories and directory entries is one reason why Linux moved to forbidding reading directories as regular files.

            Did Linux ever allow it? I was under the impression that it had been distinguishing feature between it and the BSDs quite early on.

            1. 1

              Based on looking at the source code of 0.96c on www.tuhs.org, it appears that early Linux allowed reading directories with read(). ext_file_read() in fs/ext/file.c, appears to specifically allows reads of S_ISDIR() inodes, although 0.96c also has readdir() and friends.

        2. 6

          Even better: get rid of the hierarchy in favor of flat storage with tags and indexed search. This is already a reality in Gmail and macOS. And S3 and other blob stores are similar.

          Either option requires rethinking all of our existing tools, though, especially the ones for software development, so I’m not holding my breath.

          1. 12

            My blog has twenty years of posts (5,200+) and I have about twice as many tags (10,000+) as I do posts, and I still have trouble finding stuff via tags. Often times, when I write a post, the tags I include won’t be the tags I expect them to be when I go looking for the post in the future. It’s hard to properly tag stuff, and I have about a 50/50 shot of find a post via the tags I used.

            Just an observation I have.

            1. 2

              But it’s still better than a heirarchy

            2. 6

              I like hierarchies though! I want the “science” tag to include everything in the “physics” tag and also a few other things. Hierarchical tags FTW

              1. 3

                Even better: Logic programming. In addition to the hard-coded tags, we should be able to write a set of prolog / datalog rules like Tagged(X, "science") :- Tagged(X, "physics") to automatically assign “soft” tags. This would make it really easy to tag and re-tag things, and it would make searching really flexible!

              2. 4

                See “Designing better file organization around tags, not hierarchies”.

                I’m mildly obsessed with this type of idea. The poster child is Apple’s Newton, which didn’t have a filesystem at all, rather a shared graph database called the “soup”.

                Also, I’ve been building a b-tree and hash based storage manager, and ruminating that, from a database perspective, a filesystem is just a shitty database with a very limited schema and a terrible lack of durability (not to mention A, C and I, too.)

                1. 2

                  I’ve been pondering this idea on and off for years! It has occurred to me that you might actually be able to bridge the gap by treating a path, like “/home/george/foo.txt” as referencing a file with tags “home”, “george”, and “foo.txt” (or maybe “foo” and “.txt”). The only problem is that a tag-based filesystem could contain more than one file with those tags, so maybe you’d just take the first one or something. It’s definitely not perfect.

                2. 3

                  Zookeeper works this way, in case you want to see a system like this in the wild.

                  1. 2

                    It’s worth pointing out that not all filesystems follow the same patterns - famously, NTFS gives each file a “default” data stream and other named data streams, so a file can also be a kind of directory. Going in the other direction, I believe an old version of Solaris (SunOS?) would let you cat a directory and see raw dentries, so a directory can also be a kind of file.

                    1. 2

                      A filesystem may not match the simplest type of tree structure, but there are common types of trees that only allow content in leaf nodes — such as the B+tree that’s ubiquitous in databases. (The HFS family of file systems are based on B+trees too, but it’s not a simple mapping, i.e. the tree nodes don’t directly correspond to the file hierarchy.)

                      Also, what @david_chisnall said about filesystems being a DAG, not a tree.