This post reminded me of the (very nascent) nushell project. I don’t think it directly addresses the OP’s criticism of files—certainly not at the OS level—but it’s similarish in that it’s trying to expose the structure of files to users.
I do kind of find ideas like this interesting, but it’s so hard to analyze their trade-offs without actually building the system and getting real users. One of the things I would personally be interested in is the performance ramifications of such things. Usually when you move up the abstraction ladder, you wind up having to pay for it somehow. Where is that cost here, and if it had existed decades ago, what kinds of things would it have prevented from happening? What I mean to say is that while “a sequence of bytes” doesn’t give you much, it does at least provide flexibility. What has that flexibility bought us?
@shalabh, if you’d like specifics on how this sort of thing is handled in various Xanadu systems, I can provide them.
Generally speaking, we have an abstraction layer where ‘documents’ are composed out of underlying objects of some kind: ‘documents’ are arbitrary virtual byte-streams assembled on the client side (usually piecemeal & lazily). Documents themselves look like bytestreams, except that the information about the original address of any given byte is accessible, and this lets us look up ‘formatting links’ or ‘overlays’ (which contain information equivalent to MIME for non-text but contain font, size, emphasis, or color information for text). Depending on the particular implementation, the underlying structure this abstracts from could be a set of conventional files addressed by conventional URLs (as with XanaduSpace, XanaSpace, OpenXanadu, and XanaduCambridge) or dot-separated numeric indexes for traversing a tree structure (as with Udanax Green / XU88 and Udanax Gold / XU92).
The general idea is that, for the sake of avoiding duplication, avoiding single points of failure, enabling access control / monetization schemes, and enabling simple-to-implement but powerful tree versioning, we introduce an abstraction layer between underlying bytes & documents that is mostly a matter of ordering and reordering lists of fat pointers. Users are not generally expected to be aware of the underlying abstraction unless we run up against one of a handful of points at which it intentionally leaks (for instance, if I do not have permission to view a piece of text, the span of bytes corresponding to it will be either blanked out or scrambled by a one-time pad).
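To make the fat-pointer idea concrete, here is a minimal sketch of my own (the names Span, Document, and assemble are hypothetical illustrations, not Xanadu’s actual structures or APIs, which are considerably richer): a document is just an ordered list of (source, offset, length) pointers, the client concatenates them lazily, and a new version or a transclusion is simply a new list that reuses the same underlying spans.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Span:
    source: str   # address of the underlying content (URL, tumbler, ...)
    offset: int   # where the span starts within that source
    length: int   # how many bytes it covers

# A 'document' is nothing but an ordered list of fat pointers into sources.
Document = List[Span]

def assemble(doc: Document,
             fetch: Callable[[str, int, int], bytes]) -> Tuple[bytes, List[Tuple[str, int]]]:
    """Concatenate the spans into one virtual byte-stream, keeping a
    per-byte record of (source, original offset) so that formatting
    links addressed against the sources can still be resolved."""
    out = bytearray()
    origins: List[Tuple[str, int]] = []
    for span in doc:
        data = fetch(span.source, span.offset, span.length)
        out.extend(data)
        origins.extend((span.source, span.offset + i) for i in range(len(data)))
    return bytes(out), origins

# Versioning falls out almost for free: an edited document is just a new
# list of spans, and unchanged text keeps pointing at the same sources.
```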
I can provide more details if there’s interest. It’s tangential to the topic, but Xanadu is an example of an existing functioning system that breaks the UNIX philosophy of flat files in favor of more structure (while maintaining an opposition to hierarchical structure).
@enkiv2 - thank you, I’m quite interested in various aspects of Xanadu. I have been recently thinking about how hypermedia could be the primary model of interacting with a computer, with other features (such as programming languages, dynamic media) built on top; rather than hypermedia itself being a feature built on other stuff. The idea is to emphasize and represent the interconnected nature of all information and enable easy interlinking of objects. The vision is this: if I open my computer I don’t have files or apps, but just documents/objects. I can interact with these in place (they’re more than static documents) but also chase links from these to other objects. These links could be the history (older versions of the same object), constituents (parts of the object), provenance (‘source objects’), etc. I’m wondering if there is a similar set of ‘link types’ defined in Xanadu? A related question is whether Xanadu has something like the web URL - a user-visible chunk that identifies a document. Finally, once Xanadu assembles the bytestream for a document, how does it know how to render it? Does it support multiple views of the same document?
I’m wondering if there is a similar set of ‘link types’ defined in Xanadu?
In Xanadu, link types are freeform strings & there are optional client-side rules for dealing with specific type strings. If a string doesn’t have specific rules associated (i.e., if it’s not a ‘formatting link’) then it’s used to generate a highlight color (which the user can also override). This way, link types can be used for freeform tagging / categorization or (for links with two or more targets) as a kind of predicate & you get automatic color coding.
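A toy sketch of that fallback behavior, purely my own illustration (FORMATTING_RULES and style_for_link_type are hypothetical names, not real Xanadu client code): known type strings get explicit formatting rules, and any other freeform string is hashed into a stable highlight color that the user can override.

```python
import hashlib

# Hypothetical illustration of the described fallback; not real Xanadu code.
FORMATTING_RULES = {
    "bold": {"weight": "bold"},
    "title": {"size": 24},
}

def style_for_link_type(link_type: str, user_overrides: dict) -> dict:
    if link_type in user_overrides:          # the user always wins
        return user_overrides[link_type]
    if link_type in FORMATTING_RULES:        # a 'formatting link'
        return FORMATTING_RULES[link_type]
    # Unknown freeform type string: derive a stable highlight color from it,
    # so e.g. 'disagrees-with' links all get the same color automatically.
    digest = hashlib.md5(link_type.encode()).digest()
    return {"highlight": "#{:02x}{:02x}{:02x}".format(*digest[:3])}
```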
A link can also connect to an arbitrary document, of course, and every version of a document is stored permanently, so revision history (stored cheaply, because content isn’t duplicated) is available. But generally there’s the expectation that you can view the original context of transclusions (which are not links, but should be displayed like links when you ask) and, from there, navigate to all the contexts you have access to.
A related question is whether Xanadu has something like the web URL - a user-visible chunk that identifies a document.
In XU88, there’s a namespace for assembled documents (the POOMfilade) versus chunks of text (the source enfilade). In XanaSpace, ordinary URLs are used & there’s the expectation that files ending in .xanadoc or .edl should be rendered (and files ending in .odl should have their links imported into the current context) by default. I was in favor of being able to write links against an assembled concatext by using the EDL’s address, so that it’s possible to do context-specific formatting, but I was never able to convince Ted.
Finally, once Xanadu assembles the bytestream for a document, how does it know how to render it?
In the absence of formatting links, we assemble as plain text & basically just simulate a text stream. (The expectation is to use word wrapping rather than hard newlines for ease of editing in XanaSpace, but otherwise, it’s just going to act like a text viewer.) The implementations I worked on didn’t get far enough to implement the type-specific rendering rules or the plugin system for custom formatting links.
There was also the idea that you could hint type within a span declaration, so that you could declare it during a normal link or transclusion & use type-specific start and length information – for instance, a section of an audio file in terms of start and end time, or an image in terms of a bounding box, or a video in terms of a combination of both. We put off the details of this, because it was unclear how to ensure that only the relevant portions of the file would be downloaded in these cases, and because in the easiest cases for XanaSpace we would be hitting HTTP endpoints that support the W3C Media Fragments standard, which exposes this anyhow.
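For a sense of what such typed spans can resolve to in the easy HTTP case, the W3C Media Fragments syntax already encodes time ranges and bounding boxes in the URI fragment; the helper below is a hypothetical illustration of that mapping, not anything XanaSpace actually shipped.

```python
# Hypothetical mapping from typed spans to W3C Media Fragment URIs.
def media_fragment(url: str, kind: str, **p) -> str:
    if kind == "audio":   # temporal range, in seconds
        return f"{url}#t={p['start']},{p['end']}"
    if kind == "image":   # spatial bounding box, in pixels
        return f"{url}#xywh={p['x']},{p['y']},{p['w']},{p['h']}"
    if kind == "video":   # both dimensions, combined with '&'
        return (f"{url}#t={p['start']},{p['end']}"
                f"&xywh={p['x']},{p['y']},{p['w']},{p['h']}")
    raise ValueError(f"unknown span type: {kind}")

# e.g. media_fragment("https://example.com/talk.ogv", "video",
#                     start=10, end=20, x=0, y=0, w=320, h=240)
# -> "https://example.com/talk.ogv#t=10,20&xywh=0,0,320,240"
```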
Does it support multiple views of the same document?
Yeah. All links basically go into an ‘overlay’ that’s like a modpack, and you can bring up a list of all links in various orders and turn them on and off for particular documents or source URLs or globally. So, you can have different sets of formatting rules for a document and switch between them. We had backend support for this but I’m not sure if any implementation actually exposed a UI for it, though this feature was certainly intended.
(A lot of criticisms of Xanadu are based on the assumption that everybody has direct access to the same global namespace of links, which are on all the time. It makes more sense to see links as something like themes or mod packs or web fonts. A document can recommend a set of links & bring them in, but all resident & enabled links, regardless of source, will be applied to any document they can be applied to. This is great if you’re an academic in a field with a canon: if you spend a lot of time reading and writing commentaries on Canterbury Tales, then your copy of Canterbury Tales will be a nexus of all sorts of overlapping links and annotations that accumulate as you get more commentaries. But nobody can impose links or formatting upon you in a way that you can’t disable.)
I also wonder if there was any ‘computation’ medium built or imagined on top of Xanadu which would expose the Xanadu structures as objects and let you build dynamic views? Or was it mostly about static media like text and images?
There was definitely the expectation of dynamic views. Details of how to do the plugin architecture were never really stable for mainline translit, but on the zigzag side we had various prototypes for this. Since in XanaSpace & XanaduSpace we used zigzag for modeling the display, there was this idea that we might have zigzag-oriented code modifying the display information (sort of like JavaScript modifying a DOM, or like writing applications in Display PostScript on NeWS). That said, the actual code never got written (and no fewer than 3 incompatible systems for scripting in zigzag got implemented), & none of the implementations after XanaSpace used ZZOGL at all, so it’s all pretty abstract.
Interesting stuff. Thanks for taking the time to answer my questions!
Tangentially, I wonder if you’ve used or have opinions on roamresearch.com. I ask because it’s the only writing system in which I’ve found a form of transclusion as well as fine-grained versioning. There’s no collaborative aspect though.
I’ve been looking at Roam as well. This isn’t pertinent to the article, but my biggest concern is that a third party company has access to my notes, and vendor lock-in to some degree. They do provide export options, but once I’ve accumulated years of interconnected notes, I wonder how functional that will be. I’d love an open-source alternative that can be self-hosted for these kinds of projects that deal with highly personal details.
Roam is also the only note-taking program I’ve come across that attempts to break free of the tyranny of linear progression, a paradigm that simply isn’t in accord with how the mind works, and one that I think has been highly limiting in the past. That’s why pen and paper have tended to win thus far - they allow you to capture the highly networked stream of consciousness in a form that is least distorted, as opposed to pretty much all existing text editors that force you to retrofit this networked stream of thoughts into a neat linear progression, with tremendous loss of insight.
I share your concerns. The offline, partial sync operation of Xanadu that enkiv2 described above seems quite useful.
Unix’s “bag of bytes” attitude towards files was a refreshing change from what came before. No longer did you have to presize your files. No longer were you restricted to fixed-size records (even for things we would call “text files”—imagine every line having to be the same size). No longer did you have to reserve space on the disk for your file up front. It even simplified access to files: you could read as much or as little as you wanted, not some fixed amount (à la CP/M or MS-DOS 1.0).
And there have been attempts to bring some order to this. Back in 1985, Electronic Arts (yes, the gaming company) developed IFF (Interchange File Format), and it caught on quite a bit on the Amiga. It comes close to what you want (blocks of data are self-delimiting, so it’s easy to skip what you don’t understand) but it doesn’t quite reach the granularity you probably want for a self-describing data format. The only file type that can still trace its format back to IFF that is still in use is PNG (only they got it somewhat backwards, of course).
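To make the “self-delimiting blocks” point concrete, here is a minimal sketch of my own (glossing over IFF’s outer FORM/LIST container chunks) that walks the basic chunk grammar: a 4-byte ID, a big-endian 32-bit length, the payload, and a pad byte when the length is odd, skipping anything the reader doesn’t understand.

```python
import struct
from typing import Callable, Dict

def walk_iff_chunks(data: bytes, handlers: Dict[bytes, Callable[[bytes], None]]) -> None:
    """Minimal IFF-style chunk walker: 4-byte ID, big-endian 32-bit size,
    payload, then a pad byte if the size is odd. Unknown chunk IDs are
    simply skipped, which is what makes the format easy to extend.
    (RIFF, the little-endian variant behind WAV and AVI, would use '<I'.)"""
    pos = 0
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack(">I", data[pos + 4:pos + 8])
        payload = data[pos + 8:pos + 8 + size]
        if chunk_id in handlers:
            handlers[chunk_id](payload)   # a chunk we understand
        # otherwise: self-delimiting, so we can skip it without parsing it
        pos += 8 + size + (size & 1)      # chunks are padded to even lengths
```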
One criticism I have is that criticism itself is cheap—you aren’t the only one to rant about the limitations of files. What’s lacking is an actual attempt to address these limitations. What’s stopping you from just doing it? Create files in your image and convince others to use them. Initially it’ll be on top of an existing implementation, but it will be something to work with.
The only file type that can still trace its format back to IFF that is still in use is PNG
Apple’s AIFF uncompressed audio format is basically IFF, although it’s not used as much in this modern age of lossy audio compression.
Microsoft’s RIFF format is basically “little-endian IFF”, and is the basis of WAV and AVI formats. Also, Wikipedia tells me, Google’s WebP format is based on RIFF.
Oh, cool. I did not know that.
What’s lacking is an actual attempt to address these limitations
That is indeed the long-term goal. However, I don’t have a clear picture of a solution. Having a more structured datastore instead of files sounds better but doesn’t significantly move the needle. I don’t think we should be designing the file structure independently of the system as a whole. The question is how meaning and information are represented across the system. Whether we have bytes or records, the deeper problem is how we associate them with meaningful views, while minimizing pre-shared information.
Forgive my ignorance, but what is the difference between the “link” you wrote about and Unix hard/symbolic links? I mean, as you mentioned in the thought experiment, we could let the system pre-define a special class of byte sequences to signal hard/symbolic links to other files (which are also byte sequences themselves, thus allowing nesting) so that applications do not need to reinvent the wheel. In this setting, when you attempt to read a file, the system will recursively replace such in-place links with the corresponding byte sequences, and I suppose this would grant files some hierarchical structure? It seems that all of the above can be trivially built upon the existing hard/symbolic link scheme, so I am a little confused.
The main difference is that these embedded links would not be auto-replaced by the OS. Rather, the API itself would expose the structure - making it different from the current file reading API. So instead of something like
read() -> bytes
You’d have something like
read() -> List[bytes | link]
(This is not really well-designed but just shows the core difference.) In the first scheme, files are streams of bytes and that is the primary model for information storage. So if an editor opens a file, it just gets one stream of bytes and displays them. This is the current model.
In the thought experiment, when the editor opens a file, it does not get a stream of bytes to display; rather, it gets a list of chunks and links to other files (which are also lists of chunks/links…). So it would be built to display and allow navigating this structured, interconnected graph, possibly using a list view. Similarly, when designing programming languages, you’d have this other option for representing links - instead of designing `import path.to.file` syntax, you could use the file link directly. You could also, for instance, design it to have one function per file and a module file that contains only links to each of the function files. Interestingly, this structure would ‘just work’ in any editor designed for this scheme.
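As a rough illustration of what that structured read might look like (the names Chunk, Link, and read_structured below are hypothetical, just fleshing out the read() -> List[bytes | link] sketch above, not any existing OS interface):

```python
from dataclasses import dataclass
from typing import Dict, List, Union

@dataclass
class Chunk:
    data: bytes          # an inline run of bytes

@dataclass
class Link:
    target: str          # identifier of another object in the store

Node = Union[Chunk, Link]
Store = Dict[str, List[Node]]

def read_structured(obj_id: str, store: Store) -> List[Node]:
    """Instead of one flat byte-stream, hand back the object's structure:
    inline chunks interleaved with links to other objects."""
    return store[obj_id]

# A 'module' could then contain nothing but links to one-function-per-object files:
store: Store = {
    "module:math": [Link("fn:add"), Link("fn:mul")],
    "fn:add": [Chunk(b"def add(a, b):\n    return a + b\n")],
    "fn:mul": [Chunk(b"def mul(a, b):\n    return a * b\n")],
}

for node in read_structured("module:math", store):
    # an editor built for this scheme would render chunks inline
    # and links as navigable references
    print(node)
```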
To put it another way, in the thought experiment, the OS natively supports representing rich interconnected structures, while in the current model the OS supports a hierarchical namespace of disconnected bytes.
Can this be implemented on top of what we have today? Yes of course, but the interplay with other tools is at the file level, so it doesn’t provide the same benefit.
Hope this helps, happy to elaborate more.
That makes sense! Basically, there is a missing layer of abstraction on top of stream-of-bytes. How this layer should be defined seems hard for people to agree upon, though, as indicated by the diversity of link representations adopted by different file formats.
Yes, exactly. A couple of things:
Even before files there were more limiting abstractions available, as spc476 wrote in another comment (https://lobste.rs/s/klvt8y/files_formats_byte_arrays#c_fthb0l)
Will just having an agreed-upon, graph-structured abstraction be sufficient? I don’t think so - I think we’ll just push the problem one level up. E.g., we might still worry about syncing information across machines, the format of the bytes that are stored, and so on. We need to think about how to represent information across the whole system, especially considering distribution and how we can minimize the pre-shared knowledge that is needed to transmit the information.