hi there, i’m one of the authors of this book
the book was originally set to be published by a big tech publisher, but they backed out so we are releasing the book for free on the web and we are going to try to self-publish it
we are in the process of re-writing the book with a bit more of a conceptual bent than what the publisher wanted, and we are happy with how it is turning out. It is a work in progress, and there are incomplete sections and some repetitive areas, but the structure is about where we want it and much of the content is finished
the book covers hypermedia in general, then htmx for hypermedia-driven web applications, then Hyperview for hypermedia-driven mobile applications
hope people find it useful and interesting
This explanation of the usefulness of hx-boost is something that had never occurred to me, and I’ve been thinking off and on for a few months now (in addition to actually incrementally using it in new things) about how HTMX improves my apps’ experience:
You might reasonably ask: what’s the advantage here? We are issuing an AJAX request and simply replacing the entire body.
Is that significantly different from just issuing a normal link request?
Yes, it is in fact different: with a boosted link, the browser is able to avoid any processing associated with the head tag. The head tag often contains many scripts and CSS file references. In the boosted scenario, it is not necessary to re-process those resources: the scripts and styles have already been processed and will continue to apply to the new content. This can often be a very easy way to speed up your hypermedia application.
That’s a good point, and might merit mention in the official documentation, because my initial reaction to hx-boost was to ignore it because there didn’t seem to be much joy in using AJAX just to transfer the full page anyway. I was mostly thinking in terms of network transfer. I don’t think I’m alone.
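For anyone who wants the quoted idea in code, here is a rough sketch of what a boosted navigation amounts to: fetch the page, swap only the body, push the URL, and leave the head alone. This is not htmx’s actual implementation; boostedNavigate and the env parameter are names made up for illustration (env defaults to the browser globals and exists only so the function can be exercised with stand-ins).

```javascript
// Hypothetical helper, NOT htmx's real code: a minimal picture of a "boosted" link.
// `env` is an assumption for testability; in a browser it is simply globalThis.
async function boostedNavigate(url, env = globalThis) {
  // Fetch the full page over AJAX, exactly as a normal link would request it.
  const response = await env.fetch(url);
  const html = await response.text();

  // Parse the response into a detached document.
  const next = new env.DOMParser().parseFromString(html, "text/html");

  // Swap only the title and body. The current <head> is never touched, so the
  // scripts and stylesheets it references are not processed again.
  env.document.title = next.title;
  env.document.body.innerHTML = next.body.innerHTML;

  // Keep the address bar and back button behaving like a normal navigation.
  env.history.pushState({}, "", url);
}
```

Whether skipping the head saves much when everything in it is already cached is a separate question; the sketch only shows where the skipped work would be.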
The head tag often contains many scripts and CSS file references.
Does the browser re-process them if those resources are cached and not expired? I don’t think it is any different in that case…
Good question. I haven’t dug into that. Maybe @1cg has?
There’s a reason this sort of thing isn’t mainstream (anymore). “Async transparent” is a synonym for “free-threaded”. If you’re going to have mutable state you need some indicators of where discontinuities in state can happen (i.e., modified by non local code) to make reasoning about your code possible. The other way you can go is the Erlang design where suspending the local stack for IO doesn’t affect reasoning because there’s no local mutable state.
yes, I don’t think async-transparency is a good fit for general purpose programming, but it is nice for front end scripting, and given the already-single-threaded nature of JavaScript, it works pretty well and opens up some nice patterns with respect to event-driven programming that are painful in JavaScript.
a cool example is what we call “The Akşimşek Gambit”, a bit of code that Deniz Akşimşek wrote to make a div draggable:
https://twitter.com/htmx_org/status/1373987721354506243
you can see it in action if you go to the (pretty spartan) demo page and select “Drag”:
https://hyperscript.org/playground/
the crux of the algorithm is this:
    repeat until event pointerup from document
      wait for pointermove(pageX, pageY) or
               pointerup(pageX, pageY) from document
      add { left: ${pageX - xoff}px; top: ${pageY - yoff}px; }
    end
where you loop until a pointer up from the document occurs, and then wait in the loop for either a move or pointer up.
Insane, but it works, and it’s pretty darned cool.
Which, come to think of it, is a pretty reasonable description of hyperscript…
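For comparison, here is roughly how the same loop might look in plain JavaScript, where the waiting has to be spelled out with promises. This is only a sketch under assumptions: nextEvent and dragUntilRelease are made-up names, and the move callback stands in for setting left/top on the element.

```javascript
// Resolve with the next event of any of the given types fired on `target`.
function nextEvent(target, types) {
  return new Promise((resolve) => {
    const handler = (event) => {
      // Remove every listener we added so each call handles exactly one event.
      for (const type of types) target.removeEventListener(type, handler);
      resolve(event);
    };
    for (const type of types) target.addEventListener(type, handler);
  });
}

// Hypothetical equivalent of the hyperscript loop: reposition on every
// pointermove, reposition one last time on pointerup, then stop.
async function dragUntilRelease(doc, xoff, yoff, move) {
  while (true) {
    const event = await nextEvent(doc, ["pointermove", "pointerup"]);
    move(event.pageX - xoff, event.pageY - yoff);
    if (event.type === "pointerup") return;
  }
}
```

In a browser this would be called from a pointerdown handler with document as doc. Each await is an explicit marker of where other code can run, which is the trade-off being debated here.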
My point is that once you add “async transparency” (aka “green threads” or “coroutines”) it’s no longer “single threaded”.
It might be less code for this one application. But I feel like everyone agreed that Angular-style directives have a very low ceiling, and eventually you just want the power of a programming language to generate your actual markup. This way you’re not stuck wading through tons of documentation to figure out the arcane attribute syntax for some hyper-specific use case.
A concrete example of this is polling. Here is the HTMX documentation for polling: https://htmx.org/docs/#polling.
You have to find the specific attribute that you need (hx-trigger) and then you have to work with the inner scheduling DSL for when to poll (every 2 seconds). What if I want to poll every odd hour? What if I want to poll only on Mondays?
I’m sure there’s a place where this is useful, maybe for small applications or side projects. It probably is slightly quicker initially. But I’m not alone in abandoning this style of UI development a long time ago.
For more dynamic polling, I would write a small script to trigger an event, and then listen for that event.
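As a rough sketch of that idea (the rule, the poll-now event name, and the helper names are all made up for illustration), the schedule lives in ordinary JavaScript and htmx only has to listen for the event:

```javascript
// Hypothetical rule: poll only on Mondays, during odd hours (local time).
function shouldPoll(now) {
  return now.getDay() === 1 && now.getHours() % 2 === 1;
}

// Check the rule on an interval and fire an event on `target` when it passes.
// `clock` is injectable only so the rule can be exercised with a fixed date.
function startPolling(target, eventName = "poll-now", intervalMs = 60_000,
                      clock = () => new Date()) {
  const tick = () => {
    if (shouldPoll(clock())) target.dispatchEvent(new Event(eventName));
  };
  const timer = setInterval(tick, intervalMs);
  return { tick, stop: () => clearInterval(timer) };
}
```

In the page you would call startPolling(document.body) and put something like hx-get="/news" hx-trigger="poll-now from:body" on the element to refresh, where /news is a placeholder URL.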
The crux of htmx is hypermedia: exchanging hypermedia with a server and using hypertext as the engine of application state. It will work well for some types of applications, and size is less important than the flavor of the application: gmail would be great as a hypermedia application, google sheets would not. The core question is: can the UX I desire be achieved with hypermedia? My hope is that, with htmx, the answer to that question is “yes” for a larger subset of applications.
I definitely agree that there are appropriate hypermedia applications - for me, it’s static content like blogs. I always classified gmail as the exact kind of application that is not good for hypermedia. In fact, Gmail is the complete poster child for AJAX and SPAs. It’s always referenced as one of the first applications that used the SPA style, and I think that makes total sense.
I think you could build a very nice email client entirely in htmx, using hypermedia clients. I like to mention GMail because it was done as a SPA and, in my opinion, didn’t need to be. Certainly what react was created for, to append a comment to a comment list, didn’t require an SPA. Something like Sheets or Maps, on the other hand, would be hard to implement using the hypermedia network architecture.
In general my hope is to expand the class of applications, dynamism & the general UX level that people feel they can achieve with hypermedia:
It’s a mindset change (reversion?), to be sure, but for many applications it can simplify things quite a bit.
These are exactly the kind of considerations I’m trying to address with my talk. In the demo part I show a faceted search engine, with a special facet option that updates its results according to a change in server state which is the consequence of a user action elsewhere on the page. Pretty far from “static content”, “small application” or “side project”, don’t you think? It’s the same level of complexity as what you find in the Gmail UI.
Yeah, I tend to prefer Alpine.js because even though it is inspired by Angular/Vue, at the end of the day, it’s just JavaScript. If you wanted to make a polling directive in Alpine, you would just write the normal window.setInterval code for it, instead of going into some “inner platform” language.
Perhaps a more important change is that the entire team became “full-stack” developers, instead of having a hard front-end/back-end split. That’s huge, in my opinion
agree entirely, splitting on features rather than “which side of the wire” is a huge boost in productivity
One challenge I’ve found on that front is that if you want good frontend developers it’s very hard to find those that are also willing to deal with backend development, or that have much skill there. There are many good backend developers that are alright at dealing with frontend, but above a certain quality bar it’s basically impossible.
I do think it’s great to be able to have people who can really handle the entire stack, but there are issues with mandating fullstack-ness, so you gotta be a bit careful there. This is “solvable” by pairing people up but then you get into another can of worms based on who you have on the team.
Is that split really front-end vs back-end or infrastructure vs UI? I’ve not done serious web development, but my guess would be that it’s hard to find good interaction specialists but much easier to find people who can write code that runs on either end of a network connection. Are there teams where ‘UI designer’ is a separate role but the engineers work on the whole of the application stack?
Are there teams where ‘UI designer’ is a separate role but the engineers work on the whole of the application stack?
So I’ve worked on a team that ended up almost there. There was one “pure” UI designer, and another who was UI design but liked coding. There were a lot of growing pains because he couldn’t properly prototype stuff due to our backend being hard to run for people not used to the terminal.
They both could get work done provided the right tools! But the tooling we had was made (essentially) for backend engineers, so it was a huge amount of friction. That and there was just a different kind of workflow.
But yeah, the overall lesson was that there were people able to do the work we really were missing from the team (remember, the other engineers could write code in both domains! Just the frontend code wasn’t great), but those people could not really work on the team without us adjusting expectations about infra-related demands
How does HTMX make full stack dev any easier? You still need to know about the browser, HTML, forms, etc.
Yes, but we have strong evidence from the old days, pre-SPA frameworks, that server-rendered templates are manageable for developers who also do back end work.
It was only when SPAs and their attendant complexity came along that the split between front-end and back-end became pronounced.
I haven’t observed that at all. Real frontend specialty, like advanced CSS and UX stuff, has always been taken care of by frontend specialists. Everything in between, pretty much anyone can do.
The evidence being that full stack development was a relatively common thing pre-SPA era.
Not saying that there won’t always be designers involved, especially once you start talking about externally facing applications. But htmx puts full stack back on the table as a potential option in a way that SPA libraries and their attendant complexity do not.
I guess I’m a little confused as this site looks like some kind of article aggregator.
Why would you have used react for a site like this in the first place?
As they say in the talk, when they were beginning they were told they had to use react for their application to be “modern”. Sadly, many people think that that’s true, and don’t realize that there are hypermedia-oriented options like htmx, unpoly and hotwire that can give you more interactivity within the hypermedia model. So they end up going react, because everyone else is, and that’s what HR hires for.
Did y’all evaluate the different hypermedia oriented frameworks before choosing htmx? Just curious if there are significant differences.
I’m not the speaker, I’m the creator of htmx, so not an unbiased source. :)
David mentions unpoly and hotwire, two other excellent hypermedia oriented options in his talk, and he uses Stimulus for some javascript (rather than my own hobby horse, https://hyperscript.org) but he didn’t say why he picked htmx.
Generally, I would say:
We chose htmx because of its simplicity (data-attribute driven, very few but very generic features). We evaluated:
Hi there, author of the talk here, sorry for the delay.
Our application is not an article aggregator, it’s much more complex than that: it presents future laws being discussed in French parliament. The right part is the text that is being discussed, and on the left you have the amendments filed by parliamentarians, and which aim to change the text.
But still, you’re right: “Why would you have used react for a site like this in the first place?” is precisely the question I asked when I discovered the modern tools of the hypermedia approach. But not because our application is simple: because the hypermedia approach can achieve a lot more than what most people think. It’s not just for “article aggregators”, “student projects”, “quick prototyping”, “small personal websites” and simple CRUDs. All day long I use professional tools that would benefit from the hypermedia approach: Gmail, Sentry, AWS console, and others…
And this is what my talk is about: breaking the FUD spread by some people about what is doable with “old-school” web apps (web pages) and what is not doable with that approach, thus requiring the whole API+SPA stack.
TLDR:
Took 2 months (21K LoC, mostly JavaScript)
No reduction in user experience
Reduced LoC by 67% (21,500 LoC to 7200 LoC)
They increased python by 140% (500 LoC to 1200 LoC), good if you prefer python to JS
Reduced JS dependencies by 96% (255 to 9)
Reduced web build time by 88% (40s to 5s)
First load time-to-interactive was reduced by 50-60% (from 2-6 seconds to 1-2 seconds)
Much larger data sets were possible than react could handle
Memory usage was reduced by 46% (75MB to 45MB)
These are spectacular numbers that reflect that the application in question is highly amenable to the hypermedia approach.
I wouldn’t expect everyone to see this level of improvement, but at least some web apps would.
Build time might be a bit of a red herring, as I bet it was webpack. esbuild is absolutely blazing fast and gets rid of that whole axis.
I do think that measuring what happens in practice rather than theoretical minimums is good, and like up until recently webpack was unfortunately “the thing”, but at least on that front there are ways to make improvements without having to manipulate the universe
Yes clearly we switched from Webpack to esbuild. But this switch was made possible by having a lot less complicated JS stack.
For blogs yes, but if you’re building an Application (A in SPA), well. JS and all this fancy bi-directional data building, react(or whatever) components, state/cache is essential.
I think it depends. There are lots of apps for which an SPA framework is overkill, and, even within an app where an SPA framework might be the better choice overall, lots of places where that framework is nonetheless overkill.
Something like gmail or twitter could be built in htmx, because those are mostly text-and-image type websites amenable to what Roy Fielding called “coarse grain hypermedia exchanges”. Something like google sheets, not so much, because there are a lot of inter-UI dependencies that are complex, and you don’t want to introduce network calls on every recalculation of your sheet. On the other hand, the settings page of google sheets might be more simply done in htmx, which can save your complexity budget for the parts of the app that require a more complex solution.
I don’t know much about htmx, but it seems that if it really is just hypertext, it could be used in conjunction with SPA frameworks to reduce their “size”.
Yep, you can use htmx for the more straightforward areas of the application that are amenable to “coarse grain hypermedia exchanges”, to use Roy Fielding’s phrase, and then use an SPA framework where that level of interactivity is needed.
hello, not sure what the culture is here since i’m new, but i’m the creator of htmx and I’m happy to answer questions
Examples of sites created with this? I did look, but I didn’t find anything but the little demos.
I built leaddyno.com with the predecessor, intercooler.js
https://commspace.co.za is a sponsor and is built with it.
We picked up JetBrains as a sponsor this year, which I hope is a pretty good vote of confidence…
I went looking for an IRC channel for htmx; found none. #htmx on Libera is utterly empty. Clicking around, I eventually find the Talk page which tells me your community uses Discord. No thought of even having an IRC-Discord bridge?
We’ve talked about it. I’m boomer-tier when it comes to this sort of stuff, even getting a discord set up was a monumental technical achievement for me.
There is another guy who works w/ me on hyperscript mainly and his long term goal is to get us moved over to matrix.
“An” SPA implies the author doesn’t say spa but es-pee-ay. I was not aware some folks treated the word not as an acronym but as an initialism.
Exactly. How would you verbally compare and contrast an MPA vs an SPA in the same sentence? Spas and em-pee-aye’s? Spas and m-pas? It doesn’t make sense!
Never in my life have I used the term MPA. :-)
I think I would call the opposite (server side) template rendered sites.
I typically pronounce it as an initialism, but I have to admit when I wrote that title it looked funny and I switched back and forth a bunch of times. I eventually just went with the way I would say it in conversation.
In cases like that, rewrite: The SPA Alternative. It also makes your thing sound like it’s the one true solution.
Why should only <a> and <form> be able to make HTTP requests?
Why should only click & submit events trigger them?
Why should only GET & POST methods be available?
Why should you only be able to replace the entire screen?
I was reading the article and also wanted to check the homepage of the project, so I clicked on the logo with the middle button of my mouse. Nothing happened… despite the fact that the cursor (hand) suggested it should behave like a standard link. But it might be just a random bug not a design flaw…
I clicked on the logo with the middle button of my mouse. Nothing happened…
Out of interest I went to see what happens for me, normal click loads the homepage as I’d expect. cmd-click opened the homepage in the current tab, not in a new tab. That’s not behaving like an <a> link, which makes this worse than styling/moving an <a> around for me.
Downside of replicating native behaviour, all the “weird edge cases” are actually used by people who then notice when they’re missing.
Looks like that was some older junky code on the logo, it was switched over to a straight boosted link and should work as you expect now.
Using Firefox 102.0.1 on Linux, left click goes to the homepage, middle click opens in new tab, ctrl+left click opens in new tab. Not sure what happened with your interaction.
It looks like they fixed that pretty fast:
https://github.com/bigskysoftware/htmx/commit/023b7cb2a480045c10d354455c038acc93545263
Does anyone have an example of discoverability of urls working?
I’ve implemented a (JSON over HTTP) API which tried to be HATEOAS in that a client was expected to discover URLs from a root, which it could then manipulate to achieve certain goals.
I think we had two developer teams using the API (one local, one remote) and the remote one just hardcoded the URL fragments they discovered so they didn’t need to start a session and walk down to discover the correct URLs. The idea was we could change implementation and API versions and clients would handle it, but obviously this broke it.
In hindsight, latency is king and I don’t blame the remote devs for doing that, I’m just curious if anyone ever got this to work (and how)? I guess returning fully randomised URLs in the discovery phase is one way….
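To make the trade-off concrete, here is a minimal sketch of the kind of discovery walk described above. The response shape (a links object keyed by relation name), the example URLs, and the discover function are all assumptions for illustration, not the actual API:

```javascript
// Follow a chain of link relations starting from the API root.
// `fetchJson(url)` is any function returning the parsed JSON document at `url`.
async function discover(fetchJson, rootUrl, rels) {
  let url = rootUrl;
  for (const rel of rels) {
    // One round trip per hop: this is the latency cost of discovery.
    const doc = await fetchJson(url);
    const link = (doc.links || {})[rel];
    if (!link) throw new Error(`no "${rel}" link at ${url}`);
    // Links may be relative; resolve them against the document they came from.
    url = new URL(link, url).href;
  }
  return url;
}
```

A client calling discover(fetchJson, "https://api.example/", ["orders", "latest"]) pays two requests before doing any real work, which is exactly what hardcoding the discovered fragments avoids, at the price of breaking when the server changes its URLs.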
The Web is intended to be an Internet-scale distributed hypermedia system, which means considerably more than just geographical dispersion. The Internet is about interconnecting information networks across organizational boundaries. Suppliers of information services must be able to cope with the demands of anarchic scalability and the independent deployment of software components. – intro to Fielding’s thesis
This is the idea of Anarchic Scalability.
If the client and server belong to different organizations, the server cannot force the client to upgrade, nor can the client force the server not to.
You cannot force a client to use your service, you can’t stop a thousand from deciding to use your service… and there is always another web site one click away…
So how are you going to design a system that allows the server to upgrade and change… without “flag days” arranging for all clients to upgrade at the same time as the server?
Conversely, if you are all part of the same organization… isn’t there something simpler than REST you can do? The downside is that the boss of your team and their team is usually so far from the technical side… they can’t understand the problem.
I kinda agree that JSON is not a native hypermedia, but neither is HTML. Have you ever tried to encode any method other than GET or POST in pure HTML? Well, you can’t. So it turns out HTML is not a fully realized hypermedia format either. The OP links to another 7 posts trying to convince that HTML is the one true REST format and neglects to mention that you can only encode half of the method semantics.
The author insists that the client needs all sorts of special knowledge to interpret JSON payloads but HTML is somehow natively understood. Well, it’s not if the client is not a browser. The client can very well understand some JSON with a schema that supports linking and method encoding, and whatever. And that API is very much RESTful even though not every client can use it.
HTML is a native hypermedia in that it has native hypermedia controls: links and forms. JSON does not. You can impose hypermedia controls on top of JSON, but that hasn’t been as popular as people expected.
I agree entirely that HTML is a limited hypermedia, and, in particular, that it is silly that it doesn’t support the full gamut of HTTP actions. This is one of the four limitations of HTML that htmx is specifically designed to fix (from https://htmx.org/):
I get what htmx is trying to achieve. However, it doesn’t help with the REST narrative OP presents. It tries to convince us that REST is good and everyone is wrong about it (which is fine). But it also tries to convince us that HTML is the way while also being a thing on top of HTML to make it actually fulfil its role in REST.
Let’s assume for the sake of the argument that htmx is the actual hypermedia format that REST requires. Does it make REST useful? To actually use the REST API we need a very special kind of agent: a conforming web browser with scripting enabled.
Given that constraint it’s no wonder no one actually implements REST APIs. We have a whole lot of clients that are not browsers: mobile clients that implement native UI and IOT devices that can not run a browser. And if we need to build an API for those that is not REST (by the OP’s definition) anyway then why bother building a separate REST API for the browser?
I like the idea of REST. I believe its ideas are valuable and can guide API design. Insistence on a particular hypermedia format (HTML but, I guess, meaning htmx) is misguided.
in case anyone wants to contribute, the hypermedia systems repo is now public:
https://github.com/bigskysoftware/hypermedia-systems
the content of the book is licensed under Creative Commons BY-NC-SA 4.0
https://creativecommons.org/licenses/by-nc-sa/4.0/
all other content is licensed under 0-Clause BSD:
https://opensource.org/licenses/0BSD