1. 15
  1.  

  2. 9

    Tech has a short memory lately, and I would like future implementors to learn not only the lessons of the web but the lessons of pre-web hypertext systems (which often solved problems that the web has yet to address).

    I do wish the author had followed this with lessons from history. I found the requirements list interesting, but I would likely give the requirements more weight if they were tied to the specific lessons that informed them.

    1. 5

      I may write a follow-up with historical information included. Unfortunately, most of these guidelines could be turned into complex polemics on their own (and I have written some of them)!

      The guidelines are heavily influenced by my work on Xanadu, and many are a distillation of ideas threaded through a lot of Ted’s somewhat unfocused rants, some of which are not even public. When I have a chance, I’ll do some archaeology and find proper references where possible. (Alternatively, I may just write the polemics I am inclined to write on particular topics, with citations.) My “lessons” are pretty controversial (and some are controversial even within the ex-Xanadu crowd – such as the emphasis on peer-to-peer systems).

      Items #2-6 and #9-13 are things that were part of the Xanadu design since at least the early 80s (and in some cases, going back to the 60s), and are well-documented either in Ted’s criticisms of the web or in available design documents from Xanadu projects.

      Items #7 and #8 are controversial in Xanadu – they apply to Udanax Green (and probably Gold, though I’m not totally sure), but not to implementations done since 2006. My group maintained that forcing links to always apply to the original source text, as opposed to positions in the document, was a confusion of form and content of the same type XML makes (which treats conceptual groupings that function mostly as formatting guidelines as part of the content, even though those groupings are mostly entangled with form) – particularly in the context of formatting links. (After all, a page break may be appropriate in the context of, say, a book, while a paper quoting that section of the book would not want to add a page break at that point.) Ted said that supporting the document-as-assembled as a first-order object in transclusion and linking complicated both the specification and the implementation, and wasn’t strictly necessary anyhow (since you could just force that document to un-apply the offending formatting link).

      Item #14 is half-controversial. Because I support assembled documents as first-order objects for the sake of transclusion, I also think that it’s justified to cache assembled documents in some cases. Very often, a derivative work becomes more popular than the origins of its component parts – and such a work might become very fragmented, if it is itself composed from similarly fragmented sources. The important thing is that an assembled document, when cached, can nevertheless be everted into the cached parts of its source documents, so that we still gain the benefits of caching when we go to load the source. (This is particularly useful when a single source has many popular derivatives with minimal overlap – say, Poor Richard’s Almanac, whose epigrams are quoted all over the place in widely varying forms, or Marx’s work, different and mostly non-overlapping pieces of which are very important to economists, sociologists, and Bolsheviks.)
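
      To make the eversion concrete, here is a rough sketch in Python (the names and structure are purely illustrative, not taken from any Xanadu or IPFS code): caching an assembled document also populates cache entries for the source spans it transcludes, so a later load of a source document can hit the cache.

      ```python
      # Illustrative only: a span cache where storing an assembled (derivative)
      # document also fills in ("everts" into) cached pieces of its sources.
      from dataclasses import dataclass

      @dataclass(frozen=True)
      class Span:
          address: str   # permanent address of the source document
          offset: int    # byte offset into that source
          length: int    # span length in bytes

      class SpanCache:
          def __init__(self):
              self._pieces = {}  # address -> {offset: bytes}

          def cache_assembled(self, spans, payload: bytes):
              """Cache a derivative work and evert it into per-source entries."""
              cursor = 0
              for span in spans:
                  chunk = payload[cursor:cursor + span.length]
                  self._pieces.setdefault(span.address, {})[span.offset] = chunk
                  cursor += span.length

          def get(self, address: str, offset: int, length: int):
              """Return cached bytes for a source span, or None on a miss."""
              piece = self._pieces.get(address, {}).get(offset)
              if piece is not None and len(piece) >= length:
                  return piece[:length]
              return None
      ```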

      #16 is true of both Xanadu and IPFS, probably independently. (It’s controversial elsewhere. Access control is a hard problem, particularly when you’re trying to get a variety of implementations in a peer-to-peer system with potentially untrusted peers, so I would rather depend upon crypto, which will break eventually in individual cases and expose stale data, than on access control, which will be attacked directly and will probably break even more easily. However, I don’t plan to optimize this for secret data. I’d like to use it to encourage openness, and to discourage people from hiding things on it at all.)
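
      As a sketch of what I mean by relying on crypto rather than access control (my own illustration, not code from the prototype): content is encrypted, addressed by the hash of its ciphertext, and readable only by whoever has been handed the key.

      ```python
      # Illustration only. Anyone with the permanent address can fetch the
      # ciphertext; only holders of the key can read it. Note the trade-off:
      # once a key has been handed out, access can never be revoked.
      import hashlib
      from cryptography.fernet import Fernet  # pip install cryptography

      store = {}  # permanent address (hex digest of ciphertext) -> ciphertext

      def publish(plaintext: bytes):
          key = Fernet.generate_key()
          ciphertext = Fernet(key).encrypt(plaintext)
          address = hashlib.sha256(ciphertext).hexdigest()
          store[address] = ciphertext
          return address, key  # distribute the key only to permitted readers

      def fetch(address: str, key: bytes) -> bytes:
          return Fernet(key).decrypt(store[address])
      ```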

      #17 is from IPFS. Xanadu implementations have largely not been peer to peer – the business model has always been to charge for storage (though sometimes as a flagship node in a federated system). I feel like making these facilities available to people is more important than making a buck off them, so I prefer peer to peer (which, after all, requires less money up front to set up).

      #18 is my own formulation of a rule that theoretically underlies both Xanadu and the w3m’s URL rules (and seems to also be an assumption underlying the design of HTTP, particularly with regard to the design of response codes). I’ve written about it in various places, as well. TL;DR version: server-side content variability breaks all of the important parts of hypertext, while client-side content generation is a poor and wasteful simulation of regular native app development. A hypertext system should not double as an application sandbox or code delivery system.

      #19 is part of post-2006 Xanadu design, which uses a single append-only file and calls it the “permascroll”. I feel like it should be integrated into the cache system (which is absolutely necessary but, when I took over the 2006 project in 2011, was not implemented or really planned for).
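
      A minimal sketch of the idea, assuming nothing about the actual post-2006 code: writes only ever append, and a permanent address is just an (offset, length) pair into the scroll.

      ```python
      # Illustrative append-only "permascroll": data is never edited in place,
      # so (offset, length) addresses stay valid forever.
      class Permascroll:
          def __init__(self, path="permascroll.dat"):
              self.path = path

          def append(self, data: bytes):
              """Append bytes and return their permanent (offset, length) address."""
              with open(self.path, "ab") as f:
                  offset = f.tell()  # append mode positions us at the end
                  f.write(data)
              return offset, len(data)

          def read(self, offset: int, length: int) -> bytes:
              with open(self.path, "rb") as f:
                  f.seek(offset)
                  return f.read(length)

      scroll = Permascroll()
      addr = scroll.append(b"text is only ever appended, never edited in place")
      print(scroll.read(*addr))
      ```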

      #20 is a side effect of #17, and is derived from Usenet, although similar ideas exist in popular forum software, as browser extensions, and in Mastodon.

      1. 1

        Thank you for coming back and addressing this - I learn a lot from histories.

        1. 1

          No problem! I wish I could more easily link to records & stuff. The historical basis is a little weaker than I thought when I first began writing that essay – mostly coming down to “this is how Xanadu did it before TBL invented the web”.

    2. 7

      There are a number of issues with these ideas, but there are two in particular that I want to draw attention to.

      All byte spans are available to any user with a proper address. However, they may be encrypted, and access control can be performed via the distribution of keys for decrypting the content at particular permanent addresses.

      While perpetually tempting, security through encryption keys has the major drawback that it is non-revocable (you can’t remove access once it’s been granted). As a result, over time it inevitably fails open; the keys leak and more and more people have access until everyone does. This is a major drawback of any security system based only on knowledge of some secret; we’ve seen it with NFS filehandles and we’ve seen it with capabilities, among others. Useful security/access control systems must cope with secrets leaking and people changing their minds about who is allowed access. Otherwise you should leave all access control out and admit honestly that all content is (eventually) public, instead of tacitly misleading people.

      […] Any application that has downloaded a piece of content serves that content to peers.

      People will object to this, quite strongly and rightfully so. Part of the freedom of your machine belonging to you is the ability to choose what it does and does not do. Simply because you have looked at a piece of content does not mean that you want to use your resources to provide that content to other people.

      1. 1

        Any application that has downloaded a piece of content serves that content to peers.

        The other issue with this is: what if the content is illegal? (Classified government information, child abuse imagery, leaked personal health records, etc.) There are some frameworks like ZeroNet where you can choose to stop serving that content, and others like Freenet where you don’t even know whether you’re serving that content. (These come with a speed-versus-anonymity trade-off, of course.)

        I do agree with the idea that any content you fetch, you should re-serve by default, maybe with some type of blockchain voting system to pass information along to all the peers if some of the content might be questionable, giving the user a chance to delete it.

        1. 2

          Author of the original post here. My prototype uses IPFS, which uses (or plans to support, at least) distributed optional blocklists of particular hashes. This would be my model for blocking content. Anybody who doesn’t block what they’re asked to block becomes liable for hosting it.
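
          Roughly what I have in mind, as an illustration (this is not IPFS’s actual denylist format or API): before storing or serving a block, check its hash against whatever shared blocklists the user has chosen to apply.

          ```python
          # Illustrative hash-based blocking; the blocklist itself would come
          # from whatever shared lists the user opts in to.
          import hashlib

          blocklist = set()  # hex digests the user has agreed not to store or serve

          def subscribe(shared_hashes):
              """Merge a shared blocklist the user has chosen to apply."""
              blocklist.update(shared_hashes)

          def handle_request(block: bytes):
              if hashlib.sha256(block).hexdigest() in blocklist:
                  return None  # refuse: the user opted in to blocking this hash
              return block     # otherwise serve it to the requesting peer
          ```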

      2. 3

        Interesting.

        No facility exists for removing content. However, sharable blacklists can be used to prevent particular hashes from being stored locally or served to peers. Takedowns and safe harbor provisions apply not to the service (which has no servers) but to individual users, who (if they choose not to apply those blacklists) are personally liable for whatever information they host.

        This is something I have given some thought to. I agree with things not being removable; however, who controls the blacklists? That’s an extraordinary level of power. Conversely, blacklists are likely to be reactive rather than proactive, and therefore it’s almost certain that at some point a user will end up hosting something that is illegal in one state or another – without even being aware of it. Which is also a problem.

        1. 6

          The key to making peer to peer work is groups. When everyone has to manage moderation, block lists, illegal content, and encryption by themselves, the overhead makes the network difficult, if not impossible, for most people to use.

          If you base these decisions on groups, much of the overhead can be amortized, such that the cost of using the network is not much more than using a centralized, managed network like Facebook. Like-minded groups (say, /r/science and /r/chemistry) could collaborate on this to further reduce the workload.

          You also get the benefit that TOFU (trust on first use) is per group, not per individual. This greatly decreases the need for, and the importance of, manual certificate verification.
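
          As a sketch of how that might look (entirely illustrative; the names here are hypothetical and not from any existing network): a group bundles a shared blocklist with a per-group trust-on-first-use key store, so individual members don’t each have to manage those decisions.

          ```python
          # Illustration of amortizing moderation and trust decisions per group.
          from dataclasses import dataclass, field

          @dataclass
          class Group:
              name: str
              blocked_hashes: set = field(default_factory=set)   # shared moderation
              trusted_keys: dict = field(default_factory=dict)   # peer id -> key (TOFU)

              def trust_on_first_use(self, peer_id: str, key: bytes) -> bool:
                  """Accept the first key seen for a peer; flag later mismatches."""
                  known = self.trusted_keys.setdefault(peer_id, key)
                  return known == key

              def allows(self, content_hash: str) -> bool:
                  return content_hash not in self.blocked_hashes

          # Like-minded groups can pool their moderation work:
          science = Group("science", blocked_hashes={"deadbeef"})
          chemistry = Group("chemistry")
          chemistry.blocked_hashes |= science.blocked_hashes
          ```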

          1. 2

            That’s a very nice approach actually. Thanks for explaining!

          2. 3

            To be clear, this is also the same thing that corporations deal with. Safe harbor rules are basically what you would need here.

          3. 2

            A link has one or more targets, represented by a permanent address combined with an optional start offset and length (in bytes).

            In a UTF-8 world, shouldn’t this be characters rather than bytes?

            1. 1

              PNGs and other binaries are not exactly UTF-8 (and I assume you still want to access images via hyperlinks).

              1. 1

                True, but does it make sense to link to a byte position within an image or binary? In the latter case, perhaps — but might it not also make more sense to link to a character within the display version of the binary?

                1. 2

                  In my work with Xanadu, we supported bytes for all types of spans but also supported characters for text. Specs indicated we should also support (x,y) coordinate bounding boxes for images, time (in seconds, minutes, etc.) for audio, and bounding-box/time composite addresses for video. (Span format was shared between links and transclusions, and so this affected fetch and cache behavior.)
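
                  Roughly, the span shapes looked like this (my reconstruction of the idea, not the spec’s actual wire format):

                  ```python
                  # Illustrative span types: a target is a permanent address plus a span.
                  from dataclasses import dataclass
                  from typing import Optional

                  @dataclass(frozen=True)
                  class ByteSpan:       # any media type
                      offset: int
                      length: int

                  @dataclass(frozen=True)
                  class CharSpan:       # text: counted in characters, not bytes
                      start: int
                      count: int

                  @dataclass(frozen=True)
                  class BoxSpan:        # images: an (x, y) bounding box
                      x: int
                      y: int
                      width: int
                      height: int

                  @dataclass(frozen=True)
                  class TimeSpan:       # audio: seconds from the start
                      start_s: float
                      duration_s: float

                  @dataclass(frozen=True)
                  class BoxTimeSpan:    # video: bounding box over a time range
                      box: BoxSpan
                      time: TimeSpan

                  @dataclass(frozen=True)
                  class Target:
                      address: str                   # permanent address of the source
                      span: Optional[object] = None  # None means the whole document
                  ```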

                  These specifications were not fully implemented by my project, and I’m unaware whether any of the other simultaneous implementations being worked on actually supported them. Part of the reason for this is that we wanted to fetch and cache only the necessary parts of the target. This works great for text (so long as you’re using a protocol that lets you fetch only certain byte spans – we supported HTTP, which theoretically supports this). It’s a lot harder for compressed formats. Ultimately, we would either need to perform a bunch of round-trips fetching headers and things in order to store the minimum, or we would need to fetch large but not-necessarily-complete chunks (enough to identify frames). Either way would involve extremely format-dependent code, and we weren’t terribly comfortable with what amounts to a piecemeal rewrite of FFmpeg to expose strange internal stuff for our totally-unsupported special case. (We had enough of that as it is, trying to convince OpenGL to render text on a texture in a way that was portable & fast enough to use as a text editor while using the modern pipeline!)

                  Ultimately, we needn’t have worried – we only very rarely ran into third party HTTP servers that let us request particular byte spans, so most of the time we had to download whole files anyway. (I was pushing for gopher and IPFS support – gopher because it lacked the overhead of HTTP, and IPFS because it actually guaranteed permanent addresses – but neither of these actually supports fetching arbitrary byte spans anyhow.)
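
                  For reference, the byte-span fetching we relied on is just the standard HTTP Range header, which only helps when the server answers with 206 Partial Content rather than the whole file; a quick sketch (the URL is a placeholder):

                  ```python
                  # Sketch of fetching a single byte span over HTTP via a Range header.
                  import urllib.request

                  def fetch_byte_span(url: str, offset: int, length: int) -> bytes:
                      req = urllib.request.Request(
                          url, headers={"Range": f"bytes={offset}-{offset + length - 1}"})
                      with urllib.request.urlopen(req) as resp:
                          body = resp.read()
                          if resp.status == 206:               # server honored the range
                              return body
                          return body[offset:offset + length]  # server sent the whole file

                  span = fetch_byte_span("https://example.com/some-document", 1024, 256)
                  ```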

                  As a result of working on that, I’m wary of demanding that future systems support anything so rich as semantically-meaningful units on compressed formats. (After all, transclusion and bidirectional links are easy by comparison, and yet the web doesn’t support them and neither do many other “hypertext” systems!)