1. 21
  1. 2

    For another take on naming systems, I like The Hideous Name.

    1. 2

      That’s an interesting piece of history there by Pike and Weinberger. They mention some things about the “Ninth Edition” of UNIX, which I guess went into Plan 9?

      They were very much in favor of the UNIX convention of slash-separated identifiers. These types of names only seem simple. They are deceptively complicated. Just look at how many bugs there have been over the years: web servers that are tricked by /../../../../etc/passwd, archive tools that write to arbitrary directories. CWE-22 is ranked 8 in the CWE top 25 this year. Filenames are just slash-separated identifiers, so it is easy to think that you just need to split on / and call it a day. If you want to append two Unix paths and you use a string append to do it then it will appear to work, but you probably have a serious bug.
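      To make the string-append problem concrete, here is a minimal sketch in Python. The names naive_join and safe_join and the /srv/www root are made up for illustration. The check is purely lexical, so it does nothing about symlinks; that gap is the sort of thing kernel-side resolution restrictions such as openat2(2)'s RESOLVE_BENEATH are meant to cover.

```python
import posixpath


def naive_join(root: str, user_path: str) -> str:
    # Plain string append: fine for well-behaved input, exploitable otherwise.
    return root + "/" + user_path


def safe_join(root: str, user_path: str) -> str:
    """Join user_path under root and reject results that lexically escape root.

    Assumes root is an absolute POSIX path. Symlinks are NOT resolved here.
    """
    root = posixpath.normpath(root)
    # posixpath.join discards root if user_path is absolute, so an absolute
    # user_path is caught by the containment check below as well.
    candidate = posixpath.normpath(posixpath.join(root, user_path))
    if posixpath.commonpath([root, candidate]) != root:
        raise ValueError(f"path escapes {root!r}: {user_path!r}")
    return candidate


if __name__ == "__main__":
    print(naive_join("/srv/www", "../../etc/passwd"))
    print(safe_join("/srv/www", "img/logo.png"))
    for bad in ("../../etc/passwd", "/etc/passwd"):
        try:
            safe_join("/srv/www", bad)
        except ValueError as err:
            print("rejected:", err)
```

      The naive version returns a string that still starts with /srv/www, which is why it appears to work, but the kernel resolves it to a file outside that directory.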

      What I think this comes down to is a shell-centric mindset. When you have a text-based shell then you need these little strings that get reinterpreted by the kernel into something that it can actually work with. Filenames are a DSL for referring to things in the filesystem, and IMHO it’s a pretty bad DSL we’ve ended up with. It is obviously insufficient or we wouldn’t have syscalls like Linux’s openat2(2), which modify the implicit semantics of the DSL.

      The paper argues against anything that is not Unix. It’s also arguing against the Internet and Internet mail.

    2. 1

      I bet CP/M-86 would have developed almost exactly the same way as MS-DOS did — backslash directory separators, HIMEM.SYS, A20 gate, and all. They had very similar capabilities, and neither one had any real backward compatibility with the O.G. 8080 CP/M.

      1. 3

        I suspect that it would depend a lot on how the filesystem evolved. I prefer to separate the concept of a directory from a folder when discussing these things:

        • A directory is a name to file map that is stored in a filesystem.
        • A folder is a UI concept for containing a set of documents.

        On most modern systems, each folder is a UI representation of a directory in the filesystem but that doesn’t have to be the case. A lot of mainframe filesystems provided a flat namespace in the filesystem and allowed folders to be built out of userspace conventions. If you wanted to look in directory /foo/bar (using UNIX path conventions) then you’re really doing a search for all files matching the pattern /foo/bar/[^/]*.
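        As a rough sketch of the "folder as a search over a flat namespace" idea, assuming the filesystem is nothing more than a set of full names (list_folder and the sample names are hypothetical, and this is not how any particular mainframe system did it):

```python
import re


def list_folder(names, folder):
    """Return the names that sit directly inside `folder`.

    There is no directory object anywhere: the "folder" is just the set of
    names matching <folder>/[^/]*, as in the pattern described above.
    """
    prefix = folder.rstrip("/") + "/"
    pattern = re.compile(re.escape(prefix) + r"[^/]*")
    return sorted(n for n in names if pattern.fullmatch(n))


if __name__ == "__main__":
    flat_namespace = {
        "/foo/bar/a.txt",
        "/foo/bar/b.txt",
        "/foo/bar/baz/c.txt",
        "/foo/other.txt",
    }
    print(list_folder(flat_namespace, "/foo/bar"))
```

        Nothing in the storage layer knows that /foo/bar exists; the hierarchy is only a convention in how the names are written and queried.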

        The UNIX filesystem didn’t want to implement rich search semantics and so built trees, where you have a file that contains a name to inode map and marked it as a special file. Lookups in this structure involved a linear scan (I think sorting was added later so that it became a binary search) and so performance got a lot worse if a directory had a lot of files in it. This forced users to adapt their organisational structure to the implementation details of the filesystem - something that the older mainframe filesystems explicitly didn’t want.

        It’s entirely possible that CP/M would not have followed the UNIX and DOS model of conflating folders and directories and instead provided a more database-like storage layer.

        1. 3

          Couple of examples:

          Windows folders (the UI concept) are independent of filesystem directories. Derived from Cairo’s object-oriented UI, a folder in the Windows explorer is a COM object that may or may not be related to something in the filesystem. Of course an awful lot of them are 1:1 with directories, but especially in modern Windows the main ones you interact with (Documents etc.) are not.

          Original Mac OS didn’t have a hierarchical filesystem. Folders were implemented as a UI layer feature in the Finder and the metadata for that was stored in a hidden file on each disk volume. It was common for this to get confused, so you could hold the option key (I think) when inserting a disk to wipe that file and start over with a flat structure. This also meant files had to have unique names over the whole disk. With 400k disks this wasn’t a big deal.

        2. 1

          Why?

          Using the backslash for this was an innovation in MS-DOS 2. It wasn’t in DOS 1.

          I’m not aware of any other OS family that uses backslashes for this except DOS 2 and its descendants: the many versions of OS/2, DOS-based Windows, and NT-based Windows, including ReactOS.

          I don’t think there’s any kind of historical inevitability to its use.

          Classic MacOS didn’t really let you type paths or see them as text, but internally, they were occasionally represented using colons, I believe:

          Mac HD:Documents:MS Word:Letters

          VMS, like I said in the post, used dots, but delimited inside square brackets:

          [USERS.LPROVEN.SOURCE.FORTRAN]

          … the brackets are important because outside of them, the dot separates filename from extension.

          Acorn RISC OS also uses dots:

          HardDisc4.$.!boot.Choices

          UNIX-likes use forward slashes.

          The point being, there was a lot of variation in the ’70s and early-to-mid ’80s.

          It was in the 1990s that the Great Extinction Event of OS design occurred, and the industry settled on two siblings, both built in C according to designs based on the DEC PDP-11 and VAX: UNIX and Windows NT.

          1. 1

            Why backslash instead of slash?

            1. 6

              [Blogpost author here]

              This was analysed in depth by the OS/2 Museum:
              http://www.os2museum.com/wp/why-does-windows-really-use-backslash-as-path-separator/

              1. 1

                Why backslash instead of dollar?

                1. 1

                  I can only guess, but I can give you two guesses:

                  [1] The $ sign is a shifted number on IBM’s US layout.

                  [2] Both CP/M and MS-DOS are intimately intertwined with BASIC – both DR and MS sold BASIC interpreters and both OSes came bundled with BASIC. In BASIC, the $ sigil on the end of a variable means that the variable holds a string. Perhaps overloading such an important symbol was seen as being confusing?

                  1. 1

                    [2] would also explain why the division operator wasn’t used as path separator.

                    1. 1

                      :-) Well, it could, but there are only so many characters available without Shift or Alt.

                      Some OSes used dots, some slashes in one direction or another. Opening and closing single quotes mean something else in xNix, although when that was settled I don’t know. Semicolon was the version number separator on some DEC OSes. And that’s about all you had on a USA-layout Model F keyboard.