1. 11
  2. 4

    I agree that client-side state management is a very interesting design space, it’s not trivial at all. It’s also unavoidable - there is client state even without an SPA, it’s just a matter of who controls and manages that state. I would also point out that it’s even harder than is presented in this article:

    If I load the book from ViewBook and the data has changed by the time I’ve clicked EditBook, we have a problem

    Consider this other possible order of events:

    • User 1 clicks EditBook
    • User 2 clicks EditBook, in separate session
    • User 2 makes changes, and saves form
    • User 1 makes changes, and saves form

    The changes that User 2 made were not visible to User 1, and User 1’s changes might clobber User 2’s. There is nothing that you can do state-wise on the client to prevent this; you need a notion of system state, which is the state of the entire system. With system state, you can do something like track the last updated_at time, and refuse to update a resource if the client that’s updating it has stale data. Or you just let the clobber happen, and let the people with shared access to the resource sort it out.
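To make the updated_at idea concrete, here is a minimal sketch of a staleness check, replaying the four steps above. Everything in it is an assumption for illustration: `BookStore`, `SaveResult`, and the counter standing in for a real timestamp are hypothetical, and a real server would do this check inside a database transaction.

```typescript
// Hypothetical in-memory "server" illustrating an updated_at staleness check.
interface Book {
  id: number;
  title: string;
  updatedAt: number; // version stamp set by the server on every write
}

type SaveResult =
  | { ok: true; book: Book }
  | { ok: false; reason: "stale"; current: Book };

class BookStore {
  private books = new Map<number, Book>();
  private clock = 0; // monotonic counter standing in for a timestamp

  create(id: number, title: string): Book {
    const book: Book = { id, title, updatedAt: ++this.clock };
    this.books.set(id, book);
    return { ...book };
  }

  // The client sends back the updatedAt value it originally loaded.
  save(id: number, title: string, loadedUpdatedAt: number): SaveResult {
    const current = this.books.get(id);
    if (!current) throw new Error(`no book with id ${id}`);
    if (current.updatedAt !== loadedUpdatedAt) {
      // Someone else saved since this client loaded the form: reject.
      return { ok: false, reason: "stale", current: { ...current } };
    }
    const next: Book = { id, title, updatedAt: ++this.clock };
    this.books.set(id, next);
    return { ok: true, book: { ...next } };
  }
}

// Replaying the sequence of events from the list above.
const store = new BookStore();
const original = store.create(1, "Draft");
const user1Loaded = original; // User 1 clicks EditBook
const user2Loaded = original; // User 2 clicks EditBook
const user2Save = store.save(1, "User 2 title", user2Loaded.updatedAt);
const user1Save = store.save(1, "User 1 title", user1Loaded.updatedAt);
```

User 2's save goes through, and User 1's save comes back as stale along with the current server copy, so the UI can decide whether to show a diff, retry, or overwrite.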

    That’s why I find this confusing:

    server state is not application state, you should not treat your application store like a cache and you should not update cached data from user input.

    The first part is 100% true. I’m assuming by “application state,” they mean client state, and then yes - server state is not client state. If you view a web application as a global system, it has both client and database state, and both can vary independently, i.e. if it were an object it would look like this:

    class System {
      client: ClientState; // what the browser currently holds
      data: DataState;     // what the database currently holds

      // ...
    }
    

    So no matter how you slice it, the client state is a cache of the data state. Even if you fetch data on every single user interaction, the data is cached between interactions. That’s just a more frequent cache update strategy, but it doesn’t eliminate problems. Optimistic updating is another totally valid strategy, it just comes with other problems to solve.

    1. 2

      I actually agree with you on almost everything, this is a very nuanced topic.

      That’s just a more frequent cache update strategy, but it doesn’t eliminate problems

      This “strategy” isn’t meant to solve everything, but it’s a teaser to think beyond a redux-style application/cache global store, to simplify your code while making it more reliable. It doesn’t fundamentally solve the problem, but I think it’s a worthwhile improvement without having to go into the enormous complexities of CRDTs or something similar.

      If you view a web application as a global system, it has both client and database state, and both can vary independently

      This is true, and actually like you pointed out, it really matters how you model your client side state, i.e. don’t just mirror your server state.

      So to drill in, what I meant by “you should not update cached data from user input” is that your system should be able to distinguish between data exactly as the server gave it back, and data that has been modified by the user.

      With a system similar to this, it becomes trivial to detect client side interaction & let the component developer switch to “dirty” mode. It’s not going to go into dirty mode when a different user interacts… but it’s a whole lot better than nothing. Hopefully this article shows a pretty straightforward way to have that kind of detection built in.
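As a rough sketch of that separation (not the hook from the article): keep the server's copy and the user's edits in two places, and derive "dirty" from the difference. `BookDraft` and its field names are hypothetical, chosen only for illustration.

```typescript
// Hypothetical helper: the server's copy and the user's edits never mix.
interface Book {
  title: string;
  author: string;
}

class BookDraft {
  private edits: Partial<Book> = {};

  constructor(private server: Book) {}

  // What the form should display: server data overlaid with local edits.
  get value(): Book {
    return { ...this.server, ...this.edits };
  }

  // True while any field differs from what the server last returned.
  get dirty(): boolean {
    return Object.keys(this.edits).length > 0;
  }

  set(key: keyof Book, value: string): void {
    if (this.server[key] === value) {
      delete this.edits[key]; // edited back to the server value
    } else {
      this.edits[key] = value;
    }
  }

  // A fresh server response replaces the server copy; edits are untouched.
  refresh(server: Book): void {
    this.server = server;
  }
}

const draft = new BookDraft({ title: "Dune", author: "Frank Herbert" });
const dirtyBeforeEdit = draft.dirty;
draft.set("title", "Dune Messiah"); // the user types in the form
draft.refresh({ title: "Dune (revised)", author: "Frank Herbert" }); // refetch
```

After the refetch the draft is still dirty and still shows the user's title, because the user's input never overwrote the cached server data.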

      So to recap, your client side state is a lie, so deal with it accordingly, avoid sharing it with multiple components, avoid persisting it beyond its life cycle, essentially do not trust it.

      1. 3

        but it’s a teaser to think beyond a redux-style application/cache global store

        100%, like I said, there’s a large design space out there of how to manage client state. And like you said, you can go all the way to CRDTs or gossip protocols, or go simpler and fetch data from the server with each click.

        LiveView-type libraries seem to be a bit trendy as of late, but they’re also a testament to the inherent difficulty here. Under the hood they’re pretty complex, what with pub sub systems and sockets and sending DOM patches over the wire.

        I think “managed” state libraries are particularly interesting in general, like React Query, Apollo, or even the custom hook that you presented here. They allow a simpler programming model at the component level, in exchange for committing to their worldview.

        your client side state is a lie

        And this is true of all caches!

        Anyway, I’m happy you wrote on this topic. It seems like anti-SPA sentiment is also pretty trendy recently, especially here, but the fact that they open up this design space is why I can’t quit them.

    2. 3

      All state in a distributed system is local, and it’s all a lie. When you’ve made a change locally but not yet saved it, the server’s state is a lie too. Distributed systems do not have linear causality, they are relativistic.

      I had a hard time following your argument because it’s shown as code in an API I don’t know (React?) but I believe you’re saying client state should be re-fetched on every operation. I disagree (or, I agree this is sometimes best, but not always). It’s slower, puts more load on the server, causes errors if the server isn’t reachable at that moment, and basically trades one lie for another. The client state can always go out of sync with the server’s, no matter how often you try to catch up; refreshing more often makes it less common, but it still happens and you still have to deal with it.

      It is perfectly fine to let the client and server state drift. You just have to have mechanisms to detect and resolve conflicts.

      1. 1

        Distributed systems do not have linear causality, they are relativistic.

        I happened to just write about how to think about and model non-determinism, which strongly applies to distributed systems. I wouldn’t say that distributed systems have any kind of emergent behavior or any behavior that we can’t understand ahead of time - we can, it just requires having a mental model that handles their complexities.

        TLA+, for example, is one such model, and it’s been used both to prove and model-check tons of distributed algorithms and applications. I guess my point is, it sounded like you’re saying that distributed systems are some kind of magical entity, and I view them as just as deterministic as anything else that can be modeled with logic.

        1. 1

          I wasn’t saying that at all. Relativity isn’t magic. In both Special Relativity and distributed systems, there is no global ordering of events at different locations. Events can be observed in different orders. But there is limited ordering that can be applied. Butler Lampson wrote some good stuff about this back in the 70s.

          1. 1

            I haven’t read them, but have read a lot from Leslie Lamport. For example, in Time, Clocks, and the Ordering of Events in a Distributed System, you can arrive at global ordering of events of different nodes, provided you’ve set them up with certain conditions.
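A minimal sketch of the logical-clock idea from that paper, as I understand it: each node keeps a counter, bumps it on every event, and jumps past the sender's counter on receipt. Breaking ties by node id then yields one total order. The `Stamp` shape and the node names are assumptions for illustration.

```typescript
// Logical clock in the style of Lamport's paper; node ids are hypothetical.
interface Stamp {
  time: number;
  node: string;
}

class LamportClock {
  private time = 0;

  constructor(readonly node: string) {}

  // Local event or message send: advance the counter.
  tick(): Stamp {
    this.time += 1;
    return { time: this.time, node: this.node };
  }

  // Message receipt: move past the sender's timestamp.
  receive(remote: Stamp): Stamp {
    this.time = Math.max(this.time, remote.time) + 1;
    return { time: this.time, node: this.node };
  }
}

// Total order: compare counters first, then break ties by node id.
function compareStamps(a: Stamp, b: Stamp): number {
  if (a.time !== b.time) return a.time - b.time;
  return a.node < b.node ? -1 : a.node > b.node ? 1 : 0;
}

const nodeA = new LamportClock("a");
const nodeB = new LamportClock("b");
const sent = nodeA.tick(); // a sends a message
const local = nodeB.tick(); // b does something unrelated
const received = nodeB.receive(sent); // b receives a's message
const ordered = [received, local, sent].sort(compareStamps);
```

Note that `sent` and `local` are causally unrelated; the order between them comes purely from the tie-break rule, which is the "certain conditions" part of the setup.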

            So saying “there is no global ordering of events at different locations” does sound to me like a view that distributed systems are somehow magical.

            I’ll definitely check out Butler Lampson to see their perspective too.

            1. 1

              Sorry, I meant Lamport, not Lampson!

              I still don’t know why you think I’m talking about magic. “No global ordering of events” is an uncontroversial statement of fact about distributed systems. You can create annotations on events that can create partial orders, though, which is often useful enough.
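One common form of such annotations is a vector clock: each event carries a counter per node, and two events are ordered only if one clock dominates the other. A small sketch follows, where the `VectorClock` type and `compareClocks` function are names made up for illustration.

```typescript
// Vector clocks give a partial order: some pairs of events are incomparable.
type VectorClock = Record<string, number>;
type Ordering = "before" | "after" | "equal" | "concurrent";

function compareClocks(a: VectorClock, b: VectorClock): Ordering {
  let aAhead = false;
  let bAhead = false;
  const nodes = Object.keys(a).concat(Object.keys(b));
  for (const node of nodes) {
    const x = a[node] ?? 0; // a missing entry counts as zero
    const y = b[node] ?? 0;
    if (x > y) aAhead = true;
    if (x < y) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent"; // neither happened before the other
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

const causal = compareClocks({ a: 1 }, { a: 1, b: 1 });
const unrelated = compareClocks({ a: 2 }, { b: 1 });
```

The "concurrent" result is the interesting one: it is the annotation telling you there is no order to recover, so the application has to resolve the conflict some other way.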

              1. 2

                It might sound silly, but you used relativistic in a literal sense (relative to local observation), whereas I took it to mean “something we can’t predict / understand.” I like to always share resources for removing the magic from software, but in this case you already know that.

      2. 2

        I wanted to say that this article makes quite a few assumptions, but on the other hand, they might be valid assumptions.

        For example, I would expect any non-junior web app developer to know that all the information you have on the client is out of date the moment you get it from the server. Like @amw-zero says, it’s just a cache. But then again, this would be an assumption on my part.

        Another example is the list of problems stated relatively early (stale data, no error handling on save, etc.). At the end of the article, there’s a clear lie, “data is always fresh”, and to paraphrase a bit, all the problems are solved. Yes, the data is fresh unless you went to grab coffee in the meantime and somebody else updated your book. Or you forgot about your other tab. Or any number of things. The problems stated above are not solved.

        Then there are a few other obvious assumptions like stating that the budget for non-user-facing pages is at least 2 seconds. In today’s world? For fetching a simple resource? That’s a pretty lousy message - not because I’m trying to gate people with “your performance sucks”, but simply because it shouldn’t take 2 seconds for a regular CRUD API to finish requests. Because when you switch to a mobile network, you can assume it’s 10 seconds now, what now? And it’s not only about clients - if your internal API is blocking another, external-facing API, that one also just got 2 seconds slower.

        What is described in the article looks like “global state is wrong, push your state lower in the stack”, but then this would not always apply to all SPAs, just to React and the like. Plus the example is not about the state but about cache, perhaps that should either be clarified or a better example found.

        I would say that there are a few valuable points in the article, but I would like it better if they were upfront about their assumptions, and about the limits to what the “solution” is solving. OP, this is a nice article, but I think it would be worth it to revisit it and clarify some of the points.

        1. 1

          I was being a little loose with my wording.

          The theme of the article is essentially this

          • Avoid caching until it becomes a measurable problem for your users.
          • Avoid putting cached data in global state at almost any cost
          • Design patterns to show the user what state of the data they’re looking at (modified, created, fetched, locally dirty)

          My points on 500ms or 2s can be essentially discarded if you look at them absolutely, but I work on a project right now where I know the user will be on a high speed connection on a computer, and where the data integrity is critical. I suspect I’m not the only one working on a similar kind of project.

          I’d say this is the main assumption: you are working on a project where safety is more important than performance.

          But I think these general patterns apply beyond that, just with different measurements & thresholds.

          1. 2

            Right, those are some of the assumptions I would like to see in the article. If people know what you’re talking about, they can probably arrive there themselves, but if junior devs with little experience take your approach, they might end up just taking it for granted.

            But yeah, keep up the good work and write more. It’s not useless and the points you made do hold up to some scrutiny.

        2. 1

          Visual components usually have some overlapping state they need to represent. Writing HTTP requests directly in them can (and from my experience, always does) lead to disaster:

          1. Each component becomes its own client, creating redundant network traffic and load on the server.
          2. Component states go stale, but not uniformly, requiring a full page refresh to resolve.

          Say, for example, you have a BooksList component rendered in a side bar next to the main content where ViewBook or EditBook are rendered. On first page load, you’re doubling up on your requests for the same data. If you CRUD a book, the BooksList goes stale.
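The doubled request on first load can be avoided without a full store by sharing in-flight requests. Here is a sketch with a hypothetical `SharedFetcher` helper; the loader is a stand-in for a real HTTP call, and this only addresses the redundant traffic, not the staleness after a CRUD operation.

```typescript
// Hypothetical helper: components asking for the same key share one request.
class SharedFetcher<T> {
  private inFlight = new Map<string, Promise<T>>();

  constructor(private load: (key: string) => Promise<T>) {}

  get(key: string): Promise<T> {
    const pending = this.inFlight.get(key);
    if (pending) return pending; // piggyback on the request already running

    // Forget the entry once the request settles, so later calls refetch.
    const request = this.load(key).then(
      (value) => {
        this.inFlight.delete(key);
        return value;
      },
      (error) => {
        this.inFlight.delete(key);
        throw error;
      },
    );
    this.inFlight.set(key, request);
    return request;
  }
}

// Stand-in loader that counts how often the "network" is hit.
let calls = 0;
const fetcher = new SharedFetcher(async (key: string) => {
  calls += 1;
  return `response for ${key}`;
});

const fromBooksList = fetcher.get("/books"); // sidebar component
const fromViewBook = fetcher.get("/books"); // main content component
```

Both components receive the same promise, so the loader runs once. Because entries are dropped as soon as the request settles, nothing is cached past the response.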

          This is where stores (a.k.a. state containers) come in. And you’re right: they’re not great at managing volatile, server-hosted data. Assuming the store is Redux, we’d need something like redux-thunk and tell it to refetch the data on every component render. One need merely forget this less-than-obvious step for stale state to creep back into the user experience. RTK claims to solve this with automatic re-fetching. It looks complicated and I’ve never tried it.

          If I were building an app like this, I’d consider using GraphQL. Apollo’s GraphQL client includes a centralized, configurable cache. You can tell it what properties to use to identify cache entries. (By default, it looks for an id property.) You can also set a cache policy by object type at the application layer or on request at the component layer. Since GraphQL objects are strongly typed and labeled, the data can even come from different endpoints and still be synchronized elegantly and automatically across all your components. This allows you to write queries in components with a cache policy of your choice. I’ve used this myself and it’s very effective.
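To show why keying cache entries by type and id keeps components in sync, here is a hand-rolled sketch of a normalized cache. It is not Apollo's API; `NormalizedCache`, `write`, `read`, and `watch` are hypothetical names used only to illustrate the idea.

```typescript
// Hand-rolled illustration only; this is not Apollo's API.
interface Entity {
  __typename: string;
  id: string;
  [field: string]: unknown;
}

type Listener = (entity: Entity) => void;

class NormalizedCache {
  private entries = new Map<string, Entity>();
  private listeners = new Map<string, Listener[]>();

  private key(typename: string, id: string): string {
    return `${typename}:${id}`;
  }

  // Merge incoming fields into the single stored copy and notify watchers.
  write(entity: Entity): void {
    const key = this.key(entity.__typename, entity.id);
    const merged: Entity = { ...this.entries.get(key), ...entity };
    this.entries.set(key, merged);
    for (const listener of this.listeners.get(key) ?? []) listener(merged);
  }

  read(typename: string, id: string): Entity | undefined {
    return this.entries.get(this.key(typename, id));
  }

  // Components subscribe to an entity rather than to a particular response.
  watch(typename: string, id: string, listener: Listener): void {
    const key = this.key(typename, id);
    const existing = this.listeners.get(key) ?? [];
    this.listeners.set(key, existing.concat(listener));
  }
}

const cache = new NormalizedCache();
let sidebarTitle: unknown;
cache.watch("Book", "1", (book) => {
  sidebarTitle = book["title"];
});
cache.write({ __typename: "Book", id: "1", title: "Dune", pages: 412 }); // list query
cache.write({ __typename: "Book", id: "1", title: "Dune Messiah" }); // edit response
```

Because both responses land on the same `Book:1` entry, the sidebar sees the edited title without refetching, and fields the second response didn't include are kept.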

          I’m probably sounding like a broken record, but there are some newer meta-frameworks that abstract browser/server data synchronization and tread much newer ground in this problem domain. I find the islands of interactivity concept very compelling; there’s also Qwik’s resumability concept; and of course there’s HTML-over-the-wire. The term was popularized by Hotwire but pioneered by Phoenix as LiveView. HTML-over-the-wire is also available to non-Ruby/BEAM runtimes in HTMX. The advantage of these frameworks in this case is they put caching policies back in the control of the server and still have the benefit of at least some of the interactivity that made SPAs compelling in the first place.

          Perhaps less importantly, the title is a bit hyperbolic. “Lie” implies intentional deceit, which I don’t think lines up with what you write about in the rest of the article. As @snej pointed out, if you were to have a CreateBook feature, the server state becomes the “liar” until the client state is saved to the server. A less exciting but more accurate title would be, Distributed systems are hard. SPA/server synchronization is but one permutation of that problem.

          1. 1

            I put LiveView / Hotwire in the same camp as Apollo. It’s just managed client state. Which part of the state is stored where, and when state is transferred is all a design space. It’s also funny that you say that it abstracts client/server data sync, because I’ve been playing around with it recently and I can’t believe how complicated it is, with all the setup for pub sub / sockets for the state updates. I’m happy to see diversity in the client state management ecosystem, but I’m not seeing the simplicity value proposition there.