1. 23
  1. 8

    This explanation makes it seem like HATEOAS is not suitable for use as an RPC. It requires intelligence to know what a form means.

    A REST client needs little to no prior knowledge about how to interact with an application or server beyond a generic understanding of hypermedia.

    Okay, a client can push the buttons with a generic understanding of hypermedia. The hard part is not getting a list of buttons that can be pushed. The hard part is knowing what buttons should be pushed and why. That requires human intelligence.

    It’s optimizing for a problem developers don’t have. I’m not banging my head trying to figure out how to make an appropriate HTTP call. I’m banging my head trying to figure out which HTTP call will actually do the thing I want done.

    1. 6

      HATEOAS is definitely not meant to be used as an RPC, though; RPC is at the other end of the hypermedia spectrum, and a conventional CRUD API (where the URLs loosely behave like resources that you can predictably interact with using the HTTP verbs) stands somewhere in between.

      My understanding is that, in the most general sense, HATEOAS really describes the end-user-facing aspect of a web application. You’re presented with a UI that you can click around in with a mouse, using a piece of software (the browser) that doesn’t know anything specific to the application. So an SPA can still be considered a degenerate case of a HATEOAS application, where it defines its own hypermedia type (as represented by all the JS that implements it) which has a single inhabitant, and that happens to be the SPA itself. Like a language that has only one construct, a single keyword the_thing, and can only describe a single thing. In this perspective, it is the whole surface area of the API services + the SPA that is HATEOAS, not the APIs the SPA interacts with; those are an internal implementation detail.

      So, I think, HATEOAS is most useful for describing the architecture of the (sadly) old-fashioned, primarily server-side-driven applications. It describes a way to completely decouple the client-side logic from the server-side logic, so that you don’t have to keep the programs running on the two ends of a network coherent with each other, as you do in the more recent SPA approach. In practice, since the client side is completely decoupled, it gets manifested as the browser, so you just have to implement the server. If you have a number of APIs feeding an SPA (or a desktop application, for that matter), you always have to worry about what happens if the user’s browser happens to be running the version from two releases ago: does it still behave coherently? What if the new version contains a new address field for the user object, and the user sets her address from a new tab that’s running the latest version, then switches to another tab that’s still running an earlier version, goes to the user settings page, and changes her name? Will the ensuing PUT without the address field clobber the address she just set? You can of course come up with case-by-case measures to avoid issues like this whenever you can identify them (and actually care about them), but a HATEOAS application avoids them entirely by completely decoupling the two.
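
      The stale-tab scenario can be sketched in a few lines of Python; this assumes a naive server that replaces the whole stored object on PUT (the names and fields are invented for illustration):

```python
# A naive server that replaces the stored object wholesale on PUT.
store = {"user/1": {"name": "Ada", "address": None}}

def put(path, body):
    store[path] = body  # full replace: anything the client omits is lost

# Tab A (new client version) sets the address:
put("user/1", {"name": "Ada", "address": "1 Main St"})

# Tab B (old client version, which predates the address field) changes the
# name, sending only the fields it knows about:
put("user/1", {"name": "Ada Lovelace"})

print(store["user/1"])  # {'name': 'Ada Lovelace'} – the address is gone
```

      With server-rendered hypermedia the stale tab’s form comes from the server on every page load, so there is no old client version to keep coherent; sending a PATCH with only the changed fields is the usual case-by-case mitigation.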

      In my opinion, the ideal solution (which I never could find the circumstances to experiment with) is a little bit of everything. I think it’s great to try to decouple the server from the client, but only as far into the diminishing returns as is pragmatic. You can implement a mostly-server-side application where the HTML drives the interaction, with a bunch of custom application-specific UI components implemented in JS, but those components don’t try to become applications; their purpose is to implement the missing functionality of the browser while maintaining the client-server decoupling. In the extreme cases, similar to the degenerate HATEOAS view of an SPA, you can have very featureful “domain-specific widgets” that are in practice little SPAs of their own, so maybe they each get a separate team that develops both the API and the UI for it, but from the point of view of the rest of the application, they’re still (very fat) UI components. Next to all this, nothing stops you from having a bunch of CRUD or RPC endpoints specifically designed to support “third-party” agents that want to interact with the application state via a simple and stable (yet potentially feature-poor) API.

      1. 1

        Spot on!

        And I’d add that there are hypermedia formats that carry UI/layout hints (HTML, HXML) and are meant primarily for humans to drive, and formats that carry no display hints at all and are meant primarily for machines to interact with. But one doesn’t exclude the other: machine interaction is programmed by humans at the end of the day.

        Taking HTML as an example: it is a domain-agnostic format, meaning that one can represent different domains with the same “language” (e.g. this forum, a car rental system, medical records, insurance…). What differs is the “vocabulary”. For example, I can interact with and read a medical system, but I won’t be able to do anything useful because I don’t understand the vocabulary. This vocabulary can live in plain text (HTML text nodes) or in tag attributes (such as class, rel…). If we take a step further and use, say, classes to describe our domain concepts, we can hook into these to write programs that interact with these systems. We do it all the time with UI testing frameworks and web crawlers. A lot of people complain about the instability of these, but I believe the major reason is that such interactions hook into stylistic classes as opposed to domain-specific classes (which are a lot more stable and static). Granted, not everyone takes the time to add domain-specific “annotations” to their HTML.
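
        As a sketch of hooking into domain-specific classes (using only Python’s stdlib HTMLParser; the markup and class names are invented for illustration):

```python
from html.parser import HTMLParser

# Hypothetical markup: 'patient-name' is a domain class, 'text-lg' is stylistic.
HTML = '<div class="card"><span class="patient-name text-lg">Jane Doe</span></div>'

class DomainExtractor(HTMLParser):
    """Collects the text of elements tagged with a domain-specific class."""
    def __init__(self, domain_class):
        super().__init__()
        self.domain_class = domain_class
        self.capturing = False
        self.found = []

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        self.capturing = self.domain_class in classes

    def handle_data(self, data):
        if self.capturing:
            self.found.append(data)
            self.capturing = False

p = DomainExtractor("patient-name")
p.feed(HTML)
print(p.found)  # ['Jane Doe']
```

        A redesign that renames the stylistic text-lg class doesn’t break this extractor; renaming patient-name would, which is exactly why selectors tied to domain classes tend to be the more stable ones.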

        If we take a format that does not carry any UI hints, such as Siren (a few more formats can be found here: https://gtramontina.com/h-factors/), designing an interface for humans to interact with would require using properties like class (coincidence?) to conditionally render UI components.

        Please excuse any typos, or if this sounds incoherent – I wrote this and didn’t review it (perhaps I’ll review later 😅).

      2. 2

        This explanation makes it seem like HATEOAS is not suitable for use as an RPC.

        That’s exactly right. Fielding introduced REST in his dissertation to describe the design principles behind the Web (mostly the HTTP 1.1 and URI specifications). While it could be applied to the design of another distributed hypermedia system, it really doesn’t make any claims to be a good idea for RPC or even your typical CRUD API.

        Back in 2008, Fielding wrote:

        I am getting frustrated by the number of people calling any HTTP-based interface a REST API. Today’s example is the —. That is RPC. It screams RPC.

        I don’t think I’ve seen any so-called “RESTful API” that is actually a REST API. You always have some out-of-band information that the developer uses to know how to parse API responses. HTML works because a human can choose which link to click, or which form to submit.

        It’s totally OK to have a CRUD API, or RPC or whatever, if you aren’t building a system with the same design constraints as the World Wide Web.

        1. 2

          I have to say that I didn’t understand HATEOAS at all until I read this article by the same author. Now I may misunderstand it completely, but at least I misunderstand it in the same way as the author of two of my favorite frameworks.

          1. 1

            The hard part is knowing what buttons should be pushed and why. That requires human intelligence.

            As I understand it, this part is handled by defining custom media types. That is how the semantic meaning of the response is communicated. Of course, this part happens out-of-band as well, with developers reading media type docs instead of REST HTTP API docs. And you might then ask: so what have we gained? The idea is that now the server can change the names of its endpoints, and change workflows (link to link to link, etc.), and so on, and the client will still work. Because the media types + the dynamic links between things are all you need to get stuff done.
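
            A toy sketch of that idea: the client only knows the link relations defined by a hypothetical media type and follows whatever href the server currently advertises, so the server can rename its endpoints freely (all names and URLs here are invented):

```python
# Toy documents in a hypothetical media type whose spec defines the link
# relation "deposit"; the URLs themselves are opaque to the client.
V1 = {"/account/42": {"links": {"deposit": "/deposit/42"}}}
# The server later renames its endpoints; the rels stay the same:
V2 = {"/account/42": {"links": {"deposit": "/api/v2/accounts/42/deposits"}}}

def follow(server, entry, rel):
    """Fetch the entry document and follow the advertised link for `rel`."""
    doc = server[entry]       # stand-in for an HTTP GET
    return doc["links"][rel]  # the client never builds this URL itself

# The same client code works against both versions of the server:
print(follow(V1, "/account/42", "deposit"))  # /deposit/42
print(follow(V2, "/account/42", "deposit"))  # /api/v2/accounts/42/deposits
```

            The coupling that remains is to the rel names and element definitions in the media type spec, not to any particular URL layout.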

            1. 1

              But it cannot change the meaning of the endpoints/links, because then the client will still break; so why would it want to change the names? Seems to me you’re just moving the lookup from the path to the media type?

              1. 2

                I think you’re using a single media type for the whole application, so you can effectively change the meaning of the individual URIs without changing the meaning of the responses (which must still be composed of elements defined by your media type).

                This link has an example: http://www.amundsen.com/blog/archives/1041

          2. 7

            From experience, you cannot rely on URL discovery to constrain operations by a client.

            If you return URLs like /deposit/account_id/amount in a particular response, you can expect developers to construct those, even if your documentation explicitly says you have to chase through the previous steps, have the URLs returned to you, and then follow them.

            i.e. developers don’t like walking multiple requests to discover which operations are currently permissible and their associated URLs; they destructure the URLs to find semantic meaning and then use them on the fly.

            I think the only way around this would be to have opaque URLs (/opaque/some-uuid) which must be discovered by a client and are mapped into the semantic space by some other layer. This seems like it doesn’t add value.

            1. 2

              I once had to write an API for consumption where constructing URLs wasn’t allowed, but obviously people did it anyway.

              So I wrote a middleware which, for every URL in the application, would rewrite responses to use GUIDs/hashes/random data for parts of the URL. It kept a cache in memory, so you could hit the same nonsense URL repeatedly, but the cache was cleared nightly and on app restart.

              As long as consumers followed from the root document, they had no problems. And the ones which complained were politely pointed to the contract they signed about how the API was to be used.
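
              A minimal sketch of that middleware idea (the class name and URL pattern are invented; a real version would hook into the framework’s response pipeline, and deliberately persists nothing so the tokens go stale):

```python
import re
import uuid

class OpaqueUrlMiddleware:
    """Replaces application URLs in outgoing responses with opaque tokens,
    and maps tokens back to real URLs on incoming requests. The mapping is
    in-memory only, so tokens stop working after a restart or cache clear."""
    def __init__(self):
        self.token_to_url = {}

    def rewrite_response(self, body):
        def swap(match):
            token = "/opaque/" + str(uuid.uuid4())
            self.token_to_url[token] = match.group(0)
            return token
        return re.sub(r"/api/\S+", swap, body)

    def resolve_request(self, path):
        # KeyError here means an expired or hand-constructed token.
        return self.token_to_url[path]

mw = OpaqueUrlMiddleware()
body = mw.rewrite_response("next: /api/accounts/42/deposit")
token = body.removeprefix("next: ")
print(mw.resolve_request(token))  # /api/accounts/42/deposit
```

              Consumers who follow links from the root document never notice; consumers who try to destructure or hand-build /api/... URLs get a 404 the next day.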

            2. 5

              I was once on a hike with my wife in the high desert. We came to a junction and were uncertain which way to turn. We had a guide book. It had a map, but the map evidently didn’t line up with the world. Another hiker approached.

              “Excuse us,” I said. No response. “Excuse us—do you know which way to the lake?” “You should have a map,” they said, still walking unabated. “We do… well, we have a guidebook anyway, but it doesn’t show this junction.” There was an awkward silence as they walked past us, avoiding eye contact. My wife and I looked at each other, stunned. Then, without turning around, they shouted, “Welcome to the world!”

              I used to be a REST acolyte in the chorus singing the-world-is-doing-it-wrong. But it’s a tired song. If you give the world HATEOAS and the world says, “Great! We can use this for CRUD RPCs,” what is there to say but, “Welcome to the world”?

              1. 1

                Should publish this comment as a blog post and link it.

              2. 2

                This is a decent explanation of what HATEOAS is, but it also kind of implicitly suggests that this is how APIs should be doing things. For a lot of applications you really want RPC, and “proper REST” just isn’t appropriate. But it became a buzzword, so you have folks trying to fudge it.

                Part of the problem is that for programmatic APIs, you need to sort of understand what the response is going to contain anyway; HATEOAS works well enough if you just want to crawl around (e.g. web browsers or search spiders), but when people talk about an API they usually have more specific applications in mind. You’d still need to know where to look for the link in the page, so you’re just introducing latency by forcing the caller to do an extra round trip to discover URLs, rather than making them predictable. You would be right to point out that this “isn’t REST,” but maybe that’s because REST isn’t what you want.

                I’d encourage anyone who hasn’t to also read https://twobithistory.org/2020/06/28/rest.html; some choice quotes:

                We remember Fielding’s dissertation now as the dissertation that introduced REST, but really the dissertation is about how much one-size-fits-all software architectures suck, and how you can better pick a software architecture appropriate for your needs. Only a single chapter of the dissertation is devoted to REST itself; much of the word count is spent on a taxonomy of alternative architectural styles that one could use for networked applications.

                This is the deep, deep irony of REST’s ubiquity today. REST gets blindly used for all sorts of networked applications now, but Fielding originally offered REST as an illustration of how to derive a software architecture tailored to an individual application’s particular needs.