1. 25
  1.  

  2. 10

    I would be happy if there was a way to avoid two things without breaking many web pages:

    • Web pages eating all my CPU and battery (usually for no benefit to me - it’d be one thing for a 3D game I was playing to do this, but it’s usually ads)
    • Tracking

    Removing Javascript would likely vastly improve, if not completely fix, both issues.

    1. 4

      It wouldn’t completely fix the tracking problem. Tracking is possible the minute you add the referrer header and third-party images (in the form of the old-fashioned tracking pixel).

      Third-party images without the referrer header would have been a disaster of a different kind. And while eliminating third-party images would have been pretty harmless, it would have motivated a different (and likely more complex) method for having separate image CDNs and page hosting.

      1. 3

        Seems like third-party anything can and will be abused for tracking, so we’ll need to only allow first-party resources in addition to removing Javascript. I could live with that, but I’m guessing it’d be a tough sell :]

        1. 6

          There’s still plenty of information in the actual 1st-party request; in a hypothetical scenario where 3rd-party resource loading is prevented for a significant portion of the population, trackers will just switch to 1st-party requests for tracking, e.g. via an nginx module or whatnot.

          IMO it’s a mistake to view it as primarily a technical problem: it’s a social/legal one.

          1. 4

            The problem with first-party ads, and the reason so few websites can get away with using them, is one of trust. It’s hard to prove that the website actually showed the ad.

          2. 2

            First-party can just proxy the tracking pixels from third-party.

            Tracking isn’t really a solvable problem. It is like going to somebody’s house, saying “tea” when he offers a drink, and expecting that he won’t remember it. If you emit information, then you cannot hope to stop others from recording that information, i.e. tracking.

      2. 6

        I think there are two aspects to this. Below, I will use the now old-fashioned term RIA (Rich Internet Application) to refer to the “mutated application runtime”: its functionality, not its implementation.

        Replying to “HTML, which started as document markup, should never have grown into RIA”, the author basically explains that RIA-less HTML wouldn’t be much simpler, nor would it be much more efficient. In other words, the post is entirely about documents, not RIA.

        In my experience, when the argument is brought up, it is usually about RIA, not documents: HTML-less RIA, not RIA-less HTML. HTML-less RIA, legacy free RIA implementation designed from scratch for RIA need, could be simpler and more efficient. There is also no backward-compatibility requirement here. Writing a cross-platform application runtime is a big task, so it isn’t easy, but the task is not helped by the need to carry the document-markup legacy and the web-compatibility burden.

        Flutter is an attempt to create an HTML-less RIA. I doubt the author thinks Flutter does not make sense; it clearly does. Now, once we have HTML-less RIA, RIA-less HTML could save the time spent specifying and implementing the endless stream of APIs necessary for RIA, and focus on its already awesome document styling, layout, and rendering engine. I agree it wouldn’t be much simpler nor much more efficient, but it would still help greatly. This is why I feel the argument and the reply in the post are talking past each other.

        1. 4

          I think what the author is doing is responding to the many people out there on the Internet who treat this as a throwaway line (on HN for instance). I read most of them as asking for a RIA-less-HTML, and I think this is a good criticism of that idea.

          I don’t know which of us is right about what people who use this line are asking for.

          1. 2

            You mentioned HN, so let’s try some empiricism. This article just hit HN front page. https://news.ycombinator.com/item?id=23599734 is a typical response. Note that it is entirely about RIA and whether DOM is a good basis for RIA, not about document, as I predicted.

          2. 1

            HTML-less RIA, legacy free RIA implementation designed from scratch for RIA need, could be simpler and more efficient.

            I think our GUI builders like VB6 and Lazarus already implied this by their features vs footprint compared to web offerings. For more apples to apples, I also like to bring up Sciter because it’s so much more efficient than Electron etc. We could definitely do better than HTML and web browsers if we just wanted to render content efficiently. Their dominance is a legacy and/or ecosystem effect, not technical superiority, at this point.

            Edit to add: I’ll add that OS’s like MenuetOS fit a whole system in a floppy. Nobody’s building RIA’s like that for various reasons, esp productivity. It does imply our platforms or supporting libraries that the RIA’s run on could be much leaner. I’m thinking something like a GUI builder combined with a runtime lean like MenuetOS.

          3. 5

            Browsers never stopped supporting plain old static HTML. Anyone can choose to use it, without cramming a ton of JavaScript and tracking on top of it.

            The biggest problem is incentives.

            In commercial web development fast and lean code is far down the priority list below monetization, development cost and agility, business intelligence, and standing out from competition. We don’t even efficiently use technologies we have, because “tag managers” are an easier sell than proper usage of HTML.

            It’s impractical for mainstream users to switch to “static” browsers that don’t render existing pages properly. At best it’d be something they may endure rather than something they enthusiastically adopt. We already have browsers that support both static and crapware-laden pages.

            Google AMP is the closest we’ve got to a static HTML subset working in practice, thanks to the incentive of being above the fold in mobile search results. But even this is only enough to make some publishers adopt it, and when they do, they always have the AMP version only as a second-rate copy alongside their primary crap-laden one.

            1. 6

              To be honest, Google AMP is not any better. It’s just vendor lock-in to Google, enforced by the biggest monopoly currently on the market.

              1. 3

                Yeah, AMP is a bunch of shady things. But the point is that developers don’t want simpler markup, even when Google is pushing them hard to use one. Developers slap <amp-iframe> on as soon as they can to escape the limitations.

                Even ignoring technical sins of AMP, it’s a mixed bag from user perspective too. Dumbed-down markup works for content that could be a tweet, but publishers force clicking through to the real site for anything non-trivial.

            2. 4

              Wouldn’t it make more sense to have some kind of HTTP header and/or meta tag that turns off javascript, cookies and maybe selected parts of css?

              We could then get browser vendors to treat that a bit like the HTTPS padlock indicator: some kind of visual indicator that this page is “tracking free”.

              Link tracking will be a harder nut to crack. First we turn off redirects. Only direct links to resources. Then we make a cryptographic proof of the contents of a page - something a bit fuzzy like image watermarking. Finally we demand that site owners publish some kind of list of proofs so we can verify the page is not being individually tailored to the current user.

              1. 11

                The CSP header already allows this to an extent. You can just add script-src 'none' (the quotes around 'none' are part of the syntax) and no JavaScript can run on your web page.
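
                A minimal sketch of what that looks like on the server side, using only Python's standard library. The page content, the handler name and the port are made up for illustration; the only part that matters is the Content-Security-Policy header.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Policy value: the keyword 'none' must keep its single quotes inside the header.
CSP_NO_SCRIPTS = "script-src 'none'"

# Hypothetical static page used only for this demo.
PAGE = b"<!doctype html><title>Static page</title><p>No scripts run here.</p>"


class NoScriptHandler(BaseHTTPRequestHandler):
    """Serves one static page and forbids all script execution via CSP."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Security-Policy", CSP_NO_SCRIPTS)
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)


def serve(port=8000):
    """Run the demo server until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), NoScriptHandler).serve_forever()
```

                Calling serve() and opening the page in a browser should show the policy in the response headers; any inline or external script added to the page would be refused.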

                1. 1

                  very true. not visible to the user though!

                2. 5

                  Browsers already render both text/html and application/pdf, and hyperlinking works. There is no technical barrier to add, say, text/markdown into mix. Or application/ria (see below), for that matter. We could start by disabling everything which already requires permission, that is, audio/video capture, location, notification, etc. Since application/ria would be compat hazard, it probably should continue to be text/html, and what-ideally-should-be-text/html would be something like text/html-without-ria. This clearly works. The question is one of market, that is, whether there is enough demand for this.

                  1. 5

                    Someone probably should implement this as, say, Firefox extension. PDF rendering in Firefox is already done with PDF.js. Do the exact same thing for Markdown: take a GitHub-compatible JS Markdown implementation with GitHub’s default styling and add a “prefer Markdown” preference. When the preference is set, send Accept: text/markdown, text/html. Using normal HTTP content negotiation, if the server has a text/markdown version and sends it, it is rendered just like PDF; otherwise everything works as before. Until server support arrives, the extension could probably intercept well-known URLs and replace the content with Markdown, for, say, Discourse forums. Sounds like an interesting side project to try.
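
                    To make the negotiation step concrete, here is a rough sketch of the server side in Python. The function names are invented for this example, and breaking ties between equal q-values by the order the client listed them is an assumption of the sketch, not something HTTP mandates.

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        entries.append((media_type, q, position))
    # Sort by descending q; ties keep the order in which the client listed them.
    entries.sort(key=lambda e: (-e[1], e[2]))
    return [(m, q) for m, q, _ in entries]


def choose_representation(accept_header, available):
    """Return the first acceptable media type the server has, or None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return available[0]
    return None
```

                    With the header from above, choose_representation("text/markdown, text/html", ["text/html", "text/markdown"]) picks text/markdown, and a server that only has HTML falls back to text/html, so existing sites keep working.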

                    1. 8

                      Browsers already render both text/html and application/pdf, and hyperlinking works. There is no technical barrier to add, say, text/markdown into mix.

                      Someone probably should implement this as, say, Firefox extension.

                      Historical note: this is how Konqueror (the KDE browser) started. Konqueror was not meant to be a browser, but a universal document viewer. Documents would flow through a transport protocol (implemented by a KIO library) and be interpreted by the appropriate component (called a KPart). (See https://docs.kde.org/trunk5/en/applications/konqueror/introduction.html)

                      In the end Konqueror focused on being mostly a browser, or an ad-hoc shell around KIO::HTTP and KHTML (the parent of WebKit), and Okular (the app + the KPart) took care of all the main “document formats” (PDF, DjVu, etc).

                      1. 2

                        Not saying it’s a bad idea, but there are important details to consider. E.g. you’d need to agree on which flavor of Markdown to use, there are… many.

                          1. 2

                            Eh, that’s why I specified GitHub flavor?

                            1. 1

                              Oops, my brain seems to have skipped that part when I read your comment, sorry.

                              The “variant” addition in RFC 7763 linked by spc476 to indicate which of the various Markdowns you’ve used when writing the content seems like a good idea. No need to make GitHub the owner of the specification, IMHO.

                            2. 1

                              What’s wrong with Standard Markdown?

                          2. 2

                            markdown

                            Markdown is a superset of HTML. I’ve seen this notion put forward a few times (e.g., in this thread, which prompted me to submit this article), so it seems like this is a common misconception.

                          3. 4

                            Why would web authors use it? I can imagine some small reasons (a hosting site might mandate static pages only), but they seem niche.

                            Or is your hope that users will configure their browsers to reject pages that don’t have the header? There are already significant improvements on the tracking/advertising/bloat front when you block javascript, but users overwhelmingly don’t do it, because they’d rather have the functionality.

                            1. 2

                              I think the idea is that it is a way for web authors to verifiably prove to users that the content is tracking free. A Markdown renderer would be tracking free unless buggy. (That would be an XSS bug.) The difference from noscript is that script-y sites still transparently work.

                              In the envisioned implementation, just as HTTPS sites get a padlock, document-only sites would get a cute document icon to give users a warm fuzzy feeling. If the icon is as visible as the padlock, I think many web authors will use it if their content is in fact a document and it can be easily done.

                              Note that the Markdown renderer could still use JavaScript to provide interactive features, say collapsible sections. That is okay because the JavaScript comes from the browser, which is a trusted source.

                            2. 3

                              Another HTTP header that maybe some browsers will support shoddily, and the rest will ignore?

                              1. 2

                                I have found the HTTP Accept header to be well supported by all current relevant software. That’s why I think a separate MIME type is the way to go.

                              2. 2

                                I think link tracking is essentially impossible to avoid, as are redirects. The web already has a huge problem with dead links and redirects at least make it possible to maintain more of the web over time.

                                1. 2
                                2. 4

                                  What I’d really like is for there to be a real push to provide plain text or similar alternatives to JS-heavy pages. Something like AMP++, in the same way that you can email someone a multipart email with an HTML component and a plain text component.

                                  I just want to use w3m. The only thing I use X for is a web browser. If I could open a different TTY on each monitor and browse the web easily in w3m I wouldn’t need X at all. I can watch videos using mpv -vo=tct (or -vo=drm if I’m feeling fancy).

                                  1. 3

                                    “The presentational concerns for documents are different from application rendering” is incorrect; there’s a lot of overlap, and there is a continuum between static documents and interactive applications

                                    I suspect the proposals this article is responding to are born of utter frustration with the number of things which should be anchored firmly at the static document end of the scale, but which drift aimlessly up the continuum simply because they can.

                                    Blog posts. News articles. Documentation…

                                    If these documents are written using static, semantic markup, then there are so many things which become trivial (i.e. within reach of an individual writing software in their spare time, rather than restricted to companies with thousands of employees):

                                    • Reader mode, or other “native-styled” presentation
                                    • Screen readers
                                    • Appropriate display on different devices, such as small phone screens, e-readers, etc.

                                    At the moment, there’s a significant chance that a simple implementation of one of these will end up with tons of garbage (menus, sidebars, footers, elements that should be hidden by javascript, and so on), and might not even manage to find the content (due to things being loaded at runtime, fancy embedded widgets for code or images, and so on).

                                    No-one has mentioned them yet, but I think the renewed interest in gopher and the creation of project gemini are related to this desire to share content in a way which is straightforward and light on processor and network usage (IIRC, a gemini document has to be downloaded with a single request, and should be renderable in a single pass).

                                    1. 6

                                      I suspect the proposals this article is responding to are born of utter frustration with the number of things which should be anchored firmly at the static document end of the scale, but which drift aimlessly up the continuum simply because they can.

                                      Blog posts. News articles. Documentation…

                                      I disagree. Some of the drift is aimless, but often people have very good reasons to move up the continuum. At the extreme you have things like pudding.cool and Bartosz Ciechanowski, which cram in dynamic power to elegantly explain complicated topics. But even “regular” documentation can really benefit from small amounts of JavaScript. One thing I want to do with Alloydocs is add a toggle that hides all advanced sections and asides from the reader, so that beginners aren’t distracted by stuff above their level. I don’t see a way of doing that without JS.

                                      1. 3

                                        In addition to folding, another JS-enabled feature I find valuable for documentation is client-side full-text search. The Rust documentation has this, and it is great for offline use as well as for saving a round trip to the server.

                                        1. 1

                                          Why would client side full text search need to involve javascript!? Just serve the client the full text and they can search it themselves.

                                          1. 1

                                            I am not aware of any UA providing such a feature. Sure, you can search in a single document, but not across multiple documents. The UA provides grep, not Lucene. A good implementation in JS can provide Lucene.

                                            In addition, JS can do custom ranking, say preferring types over methods. Even if the UA gets multi-document full-text search, it can’t do custom ranking with plain text: ranking is necessarily domain specific.
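
                                            To show what I mean by domain-specific ranking, here is a toy sketch. It is written in Python only to keep it short; a docs site would ship the same idea as JS. The document fields ("kind", "title", "body") and the weights are invented for this example and are not how any real documentation tool does it.

```python
import re
from collections import defaultdict

# Hypothetical ranking weights: a match on a type outranks a match on a method.
KIND_WEIGHT = {"type": 3.0, "method": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for token in tokenize(doc["title"] + " " + doc["body"]):
            index[token].add(doc_id)
    return index


def search(query, docs, index):
    """Return ids of documents matching every query token, best first."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in tokens))

    def score(doc_id):
        doc = docs[doc_id]
        title_tokens = tokenize(doc["title"])
        title_hits = sum(t in title_tokens for t in tokens)
        # Domain knowledge lives here: the kind of item scales the score.
        return KIND_WEIGHT.get(doc["kind"], 1.0) * (1 + title_hits)

    # Highest score first; ties are broken by id so the order is stable.
    return sorted(matches, key=lambda d: (-score(d), d))
```

                                            The inverted index is the part a UA could plausibly provide generically; the score function is the part it cannot, because only the documentation author knows that a type should outrank a method.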

                                            1. 1

                                              A single large document can perform very well, so the single vs multiple document argument doesn’t really hold much weight IMO.

                                              Being able to search for a kind of thing is harder; you can make it work with the right formatting (e.g. make the heading “foobar (Type)” so it can be grepped).

                                              It’s pretty rare to have a small enough corpus that you would send it all but a big enough one to need custom search. I’ll concede that it’s a real niche though.

                                        2. 1

                                          One thing I want to do with Alloydocs is add a toggle that hides all advanced sections and asides from the reader, so that beginners aren’t distracted by stuff above their level. I don’t see a way of doing that without JS

                                          I mean in theory, if your markup language allowed you to accurately label content for what it is (extra detail that can be hidden by default; I believe there’s actually a <details> tag in HTML which should do what you want), then there’s no reason the browser can’t handle it sensibly without resorting to javascript.

                                          You see this is exactly the sort of thing which causes problems for any consumer of content that isn’t a huge well-funded browser engine: If I try to download your documentation and render it as an epub to read on my e-reader (because reading long documentation on an e-ink screen is more pleasant), then I will probably end up either with all the advanced sections left in (because I didn’t run your custom javascript to hide it), or all the advanced sections left out (because I didn’t run your custom javascript to lazy load it). If I’m exceptionally unlucky, I’ll end up with all the extra details, but dumped out of context at the end of the article (because I didn’t run your custom javascript to insert them in the right place).

                                          If you use something like <details> instead, it’s trivial for my downloader to make a sensible decision to either hide it by default (if the display supports interaction) or show it (if the display doesn’t support interaction), but maybe style it so that it’s clear that it’s extra detail.
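
                                          As a rough illustration of how little a downloader needs to do when the markup is semantic, here is a sketch using Python's standard html.parser. The class name, the "[+]" and "[extra detail]" markers and the indentation are all arbitrary choices for this example.

```python
from html.parser import HTMLParser


class DetailsFlattener(HTMLParser):
    """Converts HTML to plain text, handling <details> by display capability."""

    def __init__(self, interactive):
        super().__init__()
        self.interactive = interactive
        self.out = []
        self.details_depth = 0
        self.in_summary = False

    def handle_starttag(self, tag, attrs):
        if tag == "details":
            self.details_depth += 1
        elif tag == "summary":
            self.in_summary = True

    def handle_endtag(self, tag):
        if tag == "details":
            self.details_depth -= 1
        elif tag == "summary":
            self.in_summary = False

    def handle_data(self, data):
        text = " ".join(data.split())
        if not text:
            return
        if self.in_summary:
            # The summary is always shown; it labels the optional section.
            marker = "[+] " if self.interactive else "[extra detail] "
            self.out.append(marker + text)
        elif self.details_depth > 0:
            # Body of a <details>: left collapsed on interactive displays,
            # expanded but indented on non-interactive ones.
            if not self.interactive:
                self.out.append("    " + text)
        else:
            self.out.append(text)


def flatten(html, interactive):
    """Render the HTML as text lines for an interactive or static display."""
    parser = DetailsFlattener(interactive)
    parser.feed(html)
    parser.close()
    return "\n".join(parser.out)
```

                                          Nothing in it is specific to one site, which is the whole point: the same few lines work on any document that uses <details> and <summary>, with no need to run or reverse-engineer the publisher's scripts.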

                                          1. 2

                                            I mean in theory, if your markup language allowed you to accurately label content for what it is (extra detail that can be hidden by default; I believe there’s actually a <details> tag in HTML which should do what you want), then there’s no reason the browser can’t handle it sensibly without resorting to javascript.

                                            It’s usually not “extra detail” though, so <details> is not appropriate. It can be an operator that’s logically grouped with other operators but for special roles, something that you only see in legacy scripts, things like that. The sections are marked with css classes, though, so if you can configure your browser you can handle it sensibly. For everybody else, they have the affordance of having a button they can click which triggers javascript.

                                            If I try to download your documentation and render it as an epub to read on my e-reader (because reading long documentation on an e-ink screen is more pleasant), then I will probably end up either with all the advanced sections left in (because I didn’t run your custom javascript to hide it), or all the advanced sections left out (because I didn’t run your custom javascript to lazy load it). If I’m exceptionally unlucky, I’ll end up with all the extra details, but dumped out of context at the end of the article (because I didn’t run your custom javascript to insert them in the right place).

                                            There’s a third option here that’s much better than the other two or making you convert html to epub. I’m writing documentation in rST, which supports multiple compilation targets. I can separately produce epubs with and without the advanced topics and let you choose which one to download. You get the benefits of streamlined epubs, web readers get the benefits of hiding advanced sections.

                                            1. 2

                                              It’s usually not “extra detail” though, so <details> is not appropriate. It can be an operator that’s logically grouped with other operators but for special roles, something that you only see in legacy scripts, things like that.

                                              <details> can have a <summary> tag, which allows you to hint at what sort of hidden info is there.

                                              Isn’t there a way to implement it in a “progressive” way, so that users without javascript see <details> tags, and your javascript builds on that to implement the style or behaviour you want when javascript is available?

                                              The sections are marked with css classes

                                              The trouble with css classes is that they are generally specific to the document/publisher (not standardised), so no matter how well organised your css classes are, it is not possible to write a generic tool which handles all documents well.

                                              1. 1

                                                Isn’t there a way to implement it in a “progressive” way, so that users without javascript see <details> tags, and your javascript builds on that to implement the style or behaviour you want when javascript is available?

                                                Yeah, the current version is that without javascript, you see everything. The goal is to make it so that people can also hide stuff. Using Javascript to augment, even if it’s not necessary.

                                      2. 3

                                        There are incorrect factual assumptions here. “The presentational concerns for documents are different from application rendering” is incorrect; there’s a lot of overlap, and there is a continuum between static documents and interactive applications, with use-cases all the way along that continuum. “Something that doesn’t allow for stupid custom UI or behavioural tracking. Just text, images, videos, and links” is contradictory; embedded images, videos and links allow for lots of tracking and custom UI.

                                        This is a slippery slope fallacy. Yes, there is a continuum, but there already is a bright line (namely the Web Permissions API). If it requires permission (audio/video capture, location, notification), it is not a document, simple and easy. Saying there is a continuum, even if true, is not an answer when it is used to justify things that are clearly not continuous.

                                        1. 3

                                          Gmail works in my browser without granting any particular permissions. So Gmail is a document and not an application?

                                          1. 3

                                            If you study logic, you know that “if A then B” is not logically equivalent to “if not A then not B”. Here A is “requires permissions” and B is “is not a document”. Thanks.
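
                                            For anyone who wants to check rather than take my word for it, a brute-force truth table is enough. This is a small Python sketch; the helper name is my own.

```python
from itertools import product


def implies(p, q):
    """Material implication: 'if p then q' is false only when p holds and q does not."""
    return (not p) or q


# Enumerate every truth assignment for A and B.
rows = list(product([False, True], repeat=2))

# 'if A then B' versus its inverse 'if not A then not B'.
differs_from_inverse = [(a, b) for a, b in rows
                        if implies(a, b) != implies(not a, not b)]

# 'if A then B' versus its contrapositive 'if not B then not A'.
differs_from_contrapositive = [(a, b) for a, b in rows
                               if implies(a, b) != implies(not b, not a)]
```

                                            The inverse disagrees with the original on two of the four assignments, one of which is A false and B true: something that requires no permissions and still is not a document. The contrapositive never disagrees.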

                                            1. 1

                                              I understood the original article to be saying approximately “there is no bright line between pure documents and slightly-application-like documents”. When you said “there already is a bright line” I assumed you were talking about the same thing as the original article, but I can see now that you were talking about a bright line between pure applications and slightly-document-like applications.

                                              I agree that a “document” that requires permissions is very definitely an application, but I don’t see how that helps define which web-pages are very definitely documents.

                                              1. 2

                                                If you read my other reply, you will see that I consider defining what is definitely an application more important than defining what is definitely a document, and yes, I think this is the primary point of disagreement between me and the author.

                                          2. 3

                                            It’s not a fallacy–I don’t really see how you’d argue the opposite.

                                            The author is considering a proposal to define a subset of the web that serves for documents. He argues that the goal is poorly defined, not that implementing a subset will mean you slip into some bad choice.

                                            You responded by offering a well defined concept that doesn’t meet his goal. That’s not a refutation.

                                          3. 2

                                            HTML has occasionally headed slightly towards what I’d like to see in a language for document reading. The word ‘document’ is key here. I wrote my undergraduate dissertation in HTML 3-ish, using only elements like p, dd, h1, img etc.

                                            What I found missing at the time were two features:

                                            1. The ability to strictly validate my document. XHTML promised this - but wasn’t practical for browsers or web apps given their focus.

                                            2. The ability for me to more richly mark up my document. HTML 5 brings section, aside etc.

                                            I used to build my personal site by processing XHTML with some added ‘tags’ - I added them to a local copy of the XHTML schema and used code to turn them into standard XHTML with classes for CSS to mark up.

                                            I’d like to see a new SGML dialect, though another base would be fine, if we could find one that works, which heads in the direction of making a read-only, non-interactive, easy-to-lay-out (e.g. img sizes mandatory, fixed set of fonts available, no ‘font’ tag or CSS) language.

                                            To get half way there we already have the basis for this:

                                            • A language like XHTML with some ‘tags’ removed and some added. No more script but added section, for example. None of the unnecessary parts of XML either - e.g. no ‘CDATA’. I’d be tempted to make it very incompatible with XML to avoid the temptation to use an XML parser. For example making closing tags required but they must be <> or </> or something else different.

                                            • Not XML - a different doctype and mime type.

                                            • No CSS - rendering entirely up to the user-agent (which becomes a user agent again!)

                                            The one other issue I had while writing a (very) large document in HTML was that keeping track of it was tricky. I used indentation (by heading level) to help with this, but after scrolling, I was just 4 levels to the right with no way to see which section / heading I was in. Maybe editor support could help with this. I know outlining is great for larger documents. I don’t have a right answer - keeping zero indent doesn’t work well either as you still don’t know where you are.

                                            1. 2

                                              Previous discussions of gemini highlight a different approach to the web bloat problem: instead of trying to define a subset of html no one will agree on, just define a new, simple protocol that is limited in scope on purpose. The faq even has a section on why it’s not a subset of html.

                                              1. 2

                                                I think it already exists – Markdown. And it is wildly used.

                                                1. 1

                                                  Still true, even if you mean “widely used”. Relevant either way.

                                                2. 1

                                                  Just use gopher with Markdown for formatting and you have an actually static web, which works fine when you have a client. Sadly, I know of only a single RSS client that supports gopher:// URLs. I would be thankful to anyone adding gopher support to their software; it is not difficult, as it is one of the simplest protocols.

                                                  1. 2

                                                    I think the problem is that if someone is willing to create a gopher feed, why wouldn’t they write minimal HTML instead? For example, do you see any benefits from danluu.com being available over gopher instead of HTTP/HTML?

                                                    1. 1

                                                      That’s like saying “static typing only rejects programs, why would you want one”. You could say it enables autocompletion, better performance, etc., but the primary value is rejection itself. Gopher forces minimal HTML-like content and that is valuable in itself.

                                                      1. 2

                                                        I suppose. I’m personally fine with umatrix and disabling JS by default, and would rather not have to use an entirely new program to read the sort of stuff I’m already reading online.

                                                      2. 1

                                                        Well, it’s not pure text, while in the case of gopher it is. Yes, the difference is minuscule, it is a matter of a couple of kilobytes, but these things amass.

                                                      3. 1

                                                        I’m unfamiliar with Gopher servers, can they serve XML (which RSS is a subset of)?

                                                        1. 1

                                                          Well, it is a text document, just served under the /0/ form, and it works fine with a client that is able to use the gopher protocol, of which I only know one. You can also use a gopher proxy for that.

                                                          Example: gopher://dataswamp.org/0/~lich/musings.atom.xml
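
                                                          For anyone wanting to add support, a client is roughly this much code. This is a sketch in Python with function names of my own choosing; it does no menu parsing and does not strip the terminating "." line that text items may end with.

```python
import socket
from urllib.parse import urlsplit, unquote


def parse_gopher_url(url):
    """Split a gopher:// URL into (host, port, item_type, selector)."""
    parts = urlsplit(url)
    if parts.scheme != "gopher":
        raise ValueError("not a gopher URL: " + url)
    path = unquote(parts.path)
    # The first path character after '/' is the item type ('0' = text file,
    # '1' = menu); the rest is the selector sent to the server.
    item_type = path[1] if len(path) > 1 else "1"
    selector = path[2:] if len(path) > 2 else ""
    return parts.hostname, parts.port or 70, item_type, selector


def fetch(url, timeout=10):
    """Fetch a gopher resource: send the selector, read until the server closes."""
    host, port, _item_type, selector = parse_gopher_url(url)
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(selector.encode("utf-8") + b"\r\n")
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

                                                          For the URL above, the parser yields item type 0 and selector /~lich/musings.atom.xml on port 70; the bytes that fetch returns can then be handed to whatever feed parser the client already uses.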

                                                          1. 1

                                                            OK I’d love to know more but nothing on that site works with my client (w3m)

                                                            Index of gopher://dataswamp.org/1/

                                                            [unsupported]Happy helping ☃ here: Sorry, your selector does not start with / or contains ‘..’. That’s illegal here.

                                                            1. 1

                                                              Does w3m support gopher? Lynx does and my favourite gopher client is sacc.

                                                              1. 1

                                                                Yeah according to Wikipedia’s list of gopher clients it should but obviously it is in error. lynx works better!

                                                      4. 1

                                                        The author here has really missed the point because they’re so focused on the hill they’ve decided to fight on.

                                                        The entire point is that there’s already a way that browsers do video, and that they believe that we don’t need to reinvent all the UIs for everything that the browser already does. In some ways, this is a fair argument. How often does a video site’s interface break because they decided to code their own play button?

                                                        The idea of JS-free documents is a good idea. It’s where the internet started. We live in a time where there is absolutely too much JS. Plain documents can be extremely expressive in modern HTML/CSS, and people tend to all agree that CSS is a mess of confusion and difficulty in terms of rendering.

                                                        Maybe something that is more strictly defined and provides the document as a simple tree, without introducing wild things like custom UI scripts, is better than HTML for some pages, and maybe this is worth looking into. Something like a decentralized version of Google’s AMP could be a similar middle ground.