"Why, after 8 years, I still like GraphQL sometimes in the right context" has been merged into this story.
  1. 142
    1. 33

      Excellent article, and it matches my own experience, which is as follows:

      It’s incredibly easy to build an ergonomic API with GraphQL. It’s then insanely hard to make it safe and performant.

      1. 2

        “The timeout response from our API means that your request would have returned ‘too much’ data. Limit your request to request less data. Guess. Yeah, double or even triple the number of requests, whatever, just don’t have expectations.”

      2. 32

        GraphQL, like microservices, is a business solution. The business problem is that you want your frontend team and backend team to work independently. With REST-ish, your frontend team has to request that the backend team give them endpoints X, Y, Z. That can be a bottleneck. GraphQL tries to solve this by just letting them request whatever they want, but as TFA points out this creates huge new problems from a technical point of view. I think a better solution is to work to reduce the bottleneck by integrating the frontend and backend teams. Even better, work in a full stack system unless you really know you need to separate frontend from backend.

        1. 4

          At my current workplace, I’m trying to push for FE devs to maintain their own BFF.

          1. 3

            The business problem is that you want your frontend team and backend team to work independently.

            How is that a problem? Sounds like a solution to me. But to which problem?

            And how is that business-related? Sounds like it’s organization-related to me.

            Maybe the business problem is that companies think that hiring 5x more developers will help them deliver 5x more value. But this is wrong, because collaboration has a cost. So as an organizational solution they try to split these developers into smaller teams, to have smaller collaboration costs, but that doesn’t work, because teams still need to collaborate with one another. So they try to replace collaboration with contracts (APIs), but contracts need to be negotiated (endless API discussions), which is still a cost. So they end up inventing very liberal contracts where clients can discover data structures and do whatever they like with it (GraphQL), but even that has a cost: performance issues mean infrastructure costs.

            So yes, I agree with your proposed solution: as long as you can, just keep everything coupled, hire fewer developers but instill in them a culture of collaboration (Agile manifesto to the rescue), and enjoy the velocity that is only possible when simplicity is in place.

            1. 1

              Amdahl: It’s not just a good idea, it’s the law.

          2. 24

            GraphQL makes sense at Facebook because at their scale, dealing with the consequences of allowing all possible queries was less work than creating dedicated endpoints for all the possible client queries.

            People completely missed the point of GraphQL, which is you TRADE flexibility for the client for added cost on the server.

            Indeed, with GraphQL you delegate a lot of the query building to the client, hoping that it will not suddenly change your performance profile by getting creative, and that you have not missed an obviously expensive use case.

            That’s a huge bet, especially given that GraphQL is expensive in the first place, and given that the more you grow the API in size, the less you can actually map the Cartesian product of all request params.

            Which, in an app and team as huge as Facebook’s, made sense, especially since they have the so-called Facebook apps that could do… anything.

            Most people adopted GraphQL out of hype, not because they needed it. Like they adopted microservices, SPA, or the cloud.

            You can make good use of all those things, but your use case must match it. Otherwise they are going to be costly.

            Remember when XML was the future?

            https://www.bitecode.dev/p/hype-cycles

            1. 15

              GraphQL makes sense at Facebook because at their scale, dealing with the consequences of allowing all possible queries was less work than creating dedicated endpoints for all the possible client queries.

              No! That’s not what’s going on at all.

              GraphQL solves this with persisted queries, which let you lock down what queries you allow to a set of queries crafted by your developers. There’s no reason you need to allow arbitrary queries from unknown clients just because you chose to use GraphQL.

              It’s so frustrating how often people paint these simple, solved problems as huge, gaping, unfixable holes intrinsic to the GraphQL ecosystem.
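              A minimal sketch of the persisted-query idea, assuming SHA-256 hashes as query IDs (the `register`/`lookup` functions are illustrative, not any specific library’s API):

```python
import hashlib

# Build/deploy time: record the hash of every developer-authored query.
PERSISTED: dict[str, str] = {}

def register(query: str) -> str:
    """Store a developer-crafted query; the client ships only the hash."""
    digest = hashlib.sha256(query.encode()).hexdigest()
    PERSISTED[digest] = query
    return digest

def lookup(query_hash: str) -> str:
    """Request time: only known hashes are executed, everything else is rejected."""
    if query_hash not in PERSISTED:
        raise PermissionError("unknown query hash; arbitrary queries rejected")
    return PERSISTED[query_hash]
```

              Since clients never send query text in production, the server’s attack surface shrinks to the fixed set of registered queries, much like hand-written REST endpoints.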

                1. 4

                  It’s so frustrating how often people paint these simple, solved problems as huge, gaping, unfixable holes intrinsic to the GraphQL ecosystem.

                  I like it. I skim for those points and when I see them I know that I don’t need to read the article because the author is clueless and I won’t learn anything new.

                2. 9

                  I have been working on getting rid of a GraphQL BFF at my work, so I have been thinking about this.

                  It isn’t just because of their scale. It’s the mobile apps. Think about how you’d create those. The central feature is an infinite scroll of heterogeneous cards. Text, photo, video, link, ad (probably dozens of ad types in reality, including carousels) all have different data requirements to display properly. Because mobile connections have relatively high latency, you want fewer round trips, so you want to specify exactly what you need in the query. Thus, you wind up needing a query language that has sum types and fragments. If you just had the sum types, you’d have to craft this giant query up front, so that’s where the fragments come into play. You just declare what fragments you need, and then the individual components supply the specifics.
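                  The “components supply the specifics” idea can be sketched like this; the fragment strings and the assembly function are made up for illustration, not Relay’s actual API:

```python
# Each card component declares the fields it needs as a GraphQL fragment.
FRAGMENTS = {
    "TextCard":  "fragment TextCard on Card { body }",
    "PhotoCard": "fragment PhotoCard on Card { imageUrl caption }",
    "VideoCard": "fragment VideoCard on Card { videoUrl duration }",
}

def build_feed_query(components: list[str]) -> str:
    """Assemble one feed query from whatever components the screen uses."""
    spreads = " ".join(f"...{name}" for name in components)
    defs = "\n".join(FRAGMENTS[name] for name in components)
    return f"query Feed {{ feed {{ {spreads} }} }}\n{defs}"
```

                  One round trip fetches exactly what every card type on screen needs, which is the point in a high-latency mobile context.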

                  1. 4

                    I disagree. The advantage of GraphQL is in data-intensive APIs with changing clients. It provides a lot of the benefits promised by actual HATEOAS REST.

                    As ever, by making the data-intensive API less coupled to changes in the clients, you move some complexity around. The server now needs to do things like impose limits on query complexity, and coders need to use tools that don’t generate N+1 queries everywhere.
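                    A toy version of such a limit, using nested dicts to stand in for a parsed query AST (a real server would walk the actual parse tree, and the limit value here is arbitrary):

```python
MAX_DEPTH = 3  # illustrative limit; real values depend on your schema

def depth(selection: dict) -> int:
    """Depth of a simplified query AST modeled as nested dicts."""
    if not selection:
        return 0
    return 1 + max(depth(child) for child in selection.values())

def reject_if_too_deep(query_ast: dict) -> None:
    """Refuse the query before executing any resolver."""
    d = depth(query_ast)
    if d > MAX_DEPTH:
        raise ValueError(f"query depth {d} exceeds limit {MAX_DEPTH}")
```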

                  2. 10

                    My experience with GraphQL schemas or GRPC:

                    Client code generation always works.

                    My experience with OpenAPI schema or JSON schema:

                    Keep it simple or you will face code generation issues. Issues vary from language to language.

                    Just one indication that these REST schemas are not simpler. In fact, they try something more complicated than GraphQL and GRPC: make a schema language for all REST APIs which is much less constrained.

                    Then there is also the shape of the schema files. I find the ones for GraphQL and GRPC more readable and human-friendly than those mentioned above.

                    (That is only one point in this discussion, there are obviously other factors )

                    1. 9

                      GRPC code generation works, I’ll give you that. It’s the worst api client code I’ve ever interacted with in my career, but it is technically functional.

                      1. 1

                        This matches my experience, too. I looked at OpenAPI code generation options for TypeScript a couple years ago, and it was still unfortunately janky.

                        There’s too much you can describe with it that doesn’t lend itself well to code generation.

                        The best OpenAPI code generator I ever used was https://github.com/guardrail-dev/guardrail for Java and Scala, but I haven’t found code generators for other languages with similar polish.

                      2. 8

                        I missed the GraphQL hype, but the fact that there are these types of issues still strikes me as pretty insane. These feel like core design flaws endemic to 1.0-level tech that should be mitigated as time passes, and even then, the design is highly suspicious for admitting these types of things.

                        Ah, I see now. Pushed by FB, and lapped up by everyone else. If you play with fire, don’t complain when you get burned. More charitably, consider this a good lesson in engineering: not everything put out by the almighty FAANGs is automatically good, or useful for your particular use case.

                        1. 14

                          If you play with fire, don’t complain when you get burned.

                          Often the decision to play with fire is not made by the people who end up getting burned. In our case it was a top-down directive from engineering leadership–no one on my team actually wanted this, but now we have to add workarounds for the fact that GraphQL’s integers are all limited to signed 32-bit values.
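                          One common shape for such a workaround, sketched with hypothetical helper names: values outside the spec’s signed 32-bit `Int` range travel as strings (or a custom `BigInt` scalar) and get parsed back on the client.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1  # the GraphQL spec's Int bounds

def serialize_big(value: int):
    """Send ints that fit the spec's Int as-is; larger values go out as strings."""
    if INT32_MIN <= value <= INT32_MAX:
        return value
    return str(value)

def parse_big(value) -> int:
    """Clients parse either representation back to a native integer."""
    return int(value)
```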

                          1. 3

                            That’s fair. Been lucky to work in orgs small enough where I had some say in what and how we did things, but that’s not everywhere.

                            1. 3

                              This was how it played out at the ~1,000–2,000-strong company I was at, too; no-one particularly asked for it, but we got a new VP in engineering who declared it was “absolutely the future”, and so … ¯\_(ツ)_/¯

                          2. 7

                            When $employer switched from REST to GraphQL, I called it a “sidegrade”. They just made a change which was neither an improvement (upgrade) nor a degradation (downgrade). Years later, I still hold this position.

                            Also: One of the selling points of GraphQL is that the calling side has the freedom to craft whatever query they want. Well, we found out that this could be abused for 1) getting at stuff you shouldn’t be allowed to, and 2) causing performance problems beyond just the one caller’s one GraphQL call. So what did they do? “Hey look, there’s this GraphQL feature called Persisted Queries, where we can restrict queries to a list we specify” So you’re using a tech with this selling point, then completely neutralizing that selling point? shakes head

                            Not to mention the performance cost of checking authorization at field level (as mentioned in the article). Devs began disabling authorization checks on fields case-by-case just to get their Kanban card done!

                            I wouldn’t use GraphQL for a project I’m in charge of without some serious convincing.

                            1. 2

                              They just made a change which was neither an improvement (upgrade) nor a degredation (downgrade).

                              In a business context, every change that has a migration cost but is not an upgrade is a downgrade, because the money that was spent to do the change was not spent to create business value.

                              1. 1

                                Thanks for the mention of Persisted Queries, that was what I mentioned but I couldn’t remember the name.

                              2. 7

                                c&ping my hackernews response:

                                For professional developers that know more or less what they are doing, most of those things are absolute non-issues. Let’s list it:

                                if you expose a fully self documenting query API to all clients, you better be damn sure that every field is authorised against the current user appropriately to the context in which that field is being fetched.

                                The same is true for a “regular http API” [now abbreviated as “rest API”] (in case it’s not clear: imagine what the http API would look like and think about whether it would be somehow magically secure by default. Spoiler: it won’t). Security by obscurity was never a great thing. It can help, but it’s never sufficient on its own.

                                Compare this to the REST world where generally speaking you would authorise every endpoint, a far smaller task.

                                Just think about how many endpoints you would actually need to cover all combinations that the graphql API offers.

                                Rate limiting: With GraphQL we cannot assume that all requests are equally hard on the server. (…)

                                It’s true. However: a rest API only makes that easier because it’s less flexible and you have to manually create one endpoint for each of the various graphql API combinations. So instead, a classical graphql mitigation that the author does not mention is to simply require the client (or some clients) to only use fixed (predefined) queries. That is then the same as what a rest API does, problem solved. So graphql is not worse off here, but it can be much better.

                                Query parsing

                                This point is actually fair, except for the claim that there is no equivalent in REST. That is simply not true, as there have been many cases where rest frameworks worked with json and could be ddos’d by using large json numbers.

                                Performance: When it comes to performance in GraphQL people often talk about it’s incompatibility with HTTP caching. For me personally, this has not been an issue.

                                Funny, because that is indeed one factual disadvantage of graphql.

                                Data fetching and the N+1 problem: TLDR: if a field resolver hits an external data source such as a DB or HTTP API, and it is nested in a list containing N items, it will do those calls N times.

                                No, this is wrong. But first: this problem exists in a rest API as well, only worse: instead of N+1 queries to the DB, you get N+1 http requests AND N+1 db requests. But with graphql and some libraries (or some handwritten code) it is comparably easy to write resolvers that are smart and “gather” all requests on the same query level and then execute them in one go. This is harder to do in an http api because you definitely have to do it by hand (as the author describes).
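                                A rough sketch of the “gather and execute in one go” pattern described above: a synchronous stand-in for DataLoader-style libraries, where `fetch_authors` is a made-up placeholder for a real DB call.

```python
class BatchLoader:
    """Queue keys during resolution, then fetch them all in one batch."""

    def __init__(self, batch_fn):
        self.batch_fn = batch_fn   # takes a list of keys -> {key: value}
        self.queued = []
        self.cache = {}

    def want(self, key):
        """Resolvers on the same query level register the keys they need."""
        if key not in self.cache:
            self.queued.append(key)

    def dispatch(self):
        """One batched call instead of N separate lookups."""
        keys = list(dict.fromkeys(self.queued))  # dedupe, preserve order
        if keys:
            self.cache.update(self.batch_fn(keys))
        self.queued.clear()

    def get(self, key):
        return self.cache[key]
```

                                Resolvers call `want()` as they are visited; a single `dispatch()` then replaces N lookups with one `WHERE id IN (...)`-style call.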

                                GraphQL discourages breaking changes and provides no tools to deal with them.

                                What? Come on. In a rest API there is nothing to guard against breaking changes by default. GraphQL at least comes with builtin deprecation (but not versioning, though).

                                Reliance on HTTP response codes turns up everywhere in tooling, so dealing with the fact that 200 can mean everything from everything is Ok through to everything is down can be quite annoying.

                                True, but that can be adjusted. I did that and made it so that if all queries failed due to a user error, the request returns 400; otherwise 204 if they partly failed.
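                                A sketch of that scheme, with the status numbers taken from the comment above (the `BAD_USER_INPUT` error-code key is an assumption, modeled on Apollo’s convention):

```python
def status_for(response: dict) -> int:
    """Pick an HTTP status from a GraphQL response body instead of always 200."""
    data = response.get("data") or {}
    errors = response.get("errors") or []
    if not errors:
        return 200
    all_user_errors = all(
        e.get("extensions", {}).get("code") == "BAD_USER_INPUT" for e in errors
    )
    partly_succeeded = any(v is not None for v in data.values())
    if all_user_errors and not partly_succeeded:
        return 400  # every query failed due to a user error
    if partly_succeeded:
        return 204  # partial failure, per the convention described above
    return 500      # nothing succeeded, and not (only) the user's fault
```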

                                Fetching all your data in one query in the HTTP 2+ age is often not beneficial to response time, in fact it will worsen it if your server is not parallelised

                                Okay, I’m starting to be snarky now: maybe choose a better language or server then, don’t blame it on graphql.

                                And actually, many real problems of graphql don’t even get mentioned. For example: graphql has no builtin map type, which is very annoying. Or, it has union types in the response, but you can’t use the same types as inputs due to the lack of a tagging concept.

                                And so on. My conclusion: the author is actually fairly inexperienced when it comes to graphql, and they probably had a bad experience partly due to using Ruby.

                                My own judgement: graphql is great except for very specific cases, especially in projects that involve huge numbers of servers and have a high level of stability in the API. I would always default to graphql unless the requirements speak against it.

                                1. 5

                                  And actually, many real problems of graphql don’t even get mentioned. For example: graphql has no builtin map type, which is very annoying. Or, it has union types in the response, but you can’t use the same types as inputs due to the lack of a tagging concept.

                                  These are exactly the shortcomings I expected to see (in addition to the N+1 thing). It is extremely frustrating to have a decent type system on the frontend (TypeScript), a less expressive type system on the backend (Python, in my case), and yet still be constantly hamstrung by the even simpler layer in-between.

                                  1. 4

                                    For professional developers that know more or less what they are doing

                                    What does that make the people commenting here…?

                                    1. 3

                                      What I wanted to express with this is that someone who develops software for fun as a hobby, is inexperienced, and tries out graphql might fall into these traps, but for full-time developers I think it’s very rare that that happens.

                                  2. 6

                                    We have a 9000 line schema file, so yes I’m also very much over GraphQL, but getting over it is going to take years.

                                    1. 7

                                      It could be worse. You could have 9000 lines of ad-hoc SQL scripts.

                                      1. 3

                                        Not GraphQL but we started using Prisma for connecting with our database and the ~900k line typescript file it generated from our nearly 20k line schema file causes my IDE to consume nearly 6GB of RAM just trying to parse it.

                                        1. 2

                                          Maybe once you reach this level of “scale”, you have already lost.

                                          1. 1

                                            The thing is, yes we have ~400 tables but the IDE has zero issue being aware of the table column types and their relationships when parsing the Laravel models.

                                            The node application only needs to know about 12 of those tables and so as much as it pains me to delete the majority of the schema file that’s what I ended up doing.

                                            I have given up on the idea that Prisma or GraphQL can replace what we have already and in the months since that project completed my stance on these technologies is to opt for the least complex, mature option available.

                                        1. 5

                                          There was a talk from GOTO Aarhus 2023 that somewhat glossed over how GQL didn’t fix the friction of developing using REST, it only moved the friction somewhere else. I couldn’t agree more.

                                          I’d also add that from the client side I am over the hype. I’ve spent 2 years now working on a frontend that uses GQL and I find it to be more difficult to maintain and use than plain REST or even OpenAPI REST. Especially in a system that’s always evolving. Instead of asking the backend team for a new REST endpoint I now have to ask for new GQL queries & mutations. Same friction, different grammar.

                                          1. 5

                                            There is a way to use GQL that avoids a lot of these problems. Just use it as a pretty good specification and schema language with pretty good tooling and libraries. But don’t allow clients to send arbitrary queries. The client team can suggest a new query, but the backend team has to approve it and add it to the accepted list.

                                            And I would argue this is the only way to have a public GQL API. It’s just too flexible to be practical, otherwise (as explained in the article).

                                            1. 4

                                              I am currently working on a GraphQL API and I have to say I basically agree with this article. I wrote a small blog article about the N+1 problem and DataLoaders two days ago.

                                              In my case I still think it is OK because:

                                              • this API is almost fully authenticated (SaaS backend) so most attacks can only be performed with user credentials ;
                                              • this is an early product and it speeds up development (because we don’t do full stack) ;
                                              • it is not a public API but only for use by our own clients.

                                              My plan to remediate all this is at some point to aggressively validate (whitelist) all queries used in production. In development it’s fine, but in production if your query is not whitelisted it is outright rejected. This solves most of the issues by making it closer to the REST approach.

                                              1. 4

                                                this API is almost fully authenticated (SaaS backend) so most attacks can only be performed with user credentials

                                                The problem is not authentication, but authorization.

                                                Even within your SaaS, I don’t suppose a customer should be able to access data from other customers. Or you might even have different levels of access within the same org (at some point as your product develops).

                                                As the article eloquently explains, this is a fairly easy problem to solve with REST APIs, but with a GraphQL API it is very easy to create a source of unauthorized information access through nested queries.

                                                1. 2

                                                  Of course, that does not help with authorization problems. It does help with other problems described in the post such as malicious crafted queries. One of my users could still attempt it, but we would at least know who they are. OTOH unauthenticated GraphQL APIs have to endure a ton of bots attempting to automatically attack them.

                                                  As for authorization, the product I work on currently is early and has a straightforward schema and very simple roles, so it is manageable. My previous company has a much more complex model (and a REST API) so I can understand the problems described in the post.

                                                  To be clear I didn’t advocate for GraphQL, I had actually written the first version of the API using REST. But we ended up deciding that it is a necessary trade off to speed up iteration on the frontend. My favorite approach is to have full stack developers who just write the REST endpoints they need themselves, but here it was not an option.

                                              2. 4

                                                All valid complaints! But I feel like a majority of these problems could be solved by a frontend build process that either registered, cryptographically signed, or otherwise “blessed” any queries used in the frontend codebase. Most GraphQL APIs aren’t meant to allow random users to construct queries, they’re meant to enable frontend devs to more easily access backend functionality and data.

                                                1. 4

                                                  I’ll cop to pushing for GraphQL at a previous job, and I’ll admit that authorization and performance were pretty tricky. However, I still think it made sense for us at the time for a couple of reasons: 1. We intended to build at least two new and very different clients on top of a mature, complex database schema with a lot of 3nf, and 2. Our developers were divided between experts in Vue who, in addition to myself, would be responsible for the clients, and experts in .NET who, with my gratitude, figured out the authorization and performance problems.

                                                  If HTMX had been available at the time, I might have recommended using it to build everything in .NET. But it was not. Even if it had been, there were several interactive features that we would have either had to say no to or kludge together with noodley imperative DOM code. The only real alternative under consideration at the time was Blazor. Most of us did not take it seriously because its performance was absolutely atrocious.

                                                  It’s also useful to put things in historical context, albeit recent. I think OpenAPI emerged from the relative obscurity of its proprietary Swagger origins at about the same time that GraphQL was gaining traction. Although a lot of warts in GraphQL have emerged since then, cherry-picking properties from nested data structures had a big appeal over the rigid REST APIs that GraphQL was intended to replace. The warts could have been fixed, too, but my impression is that GraphQL lost a bit of its steam whereas OpenAPI is basically everywhere now.

                                                  1. 3

                                                    The author’s first point is about GraphQL’s allegedly bigger attack surface. Again this focuses more on completely public GraphQL APIs which are relatively rare

                                                    I don’t think this is right. Offering a public vs private API doesn’t have much effect on the size of the attack surface: you can have a completely undocumented, “private” GraphQL API and you still need to worry about attackers who sniff the traffic to your web/mobile frontend, grab an auth token and start trying to find security bugs that leak another user’s data.

                                                    1. 2

                                                      The referenced first point that this comment was in response to was talking about the attack surface created by allowing arbitrary queries, which the author of this article is saying is only a hard problem if you have a public api. A private API would use persisted queries and then the attack surface is just exactly the same as a hand-crafted REST API.

                                                      The author wasn’t suggesting that private APIs don’t have to be secure.

                                                      1. 2

                                                        I assume he really means not reachable from the public internet. And only allowing persisted (so basically allowlisted) queries or query templates.

                                                        1. 1

                                                          I did exactly this with Medium’s undocumented API for https://scribe.rip (minus trying to find other user’s data)

                                                        2. 2

                                                          What are persisted queries? Is it related to caching the output?

                                                          1. 2

                                                            I believe it’s a trick where you create a big complex GraphQL query and then give it a name and add it to a server-side allow-list - so you keep the ability of GraphQL to define complex fetches as a single query while removing the risk of clients inventing arbitrarily expensive new queries without collaborating with the backend team a tiny bit to avoid expensive surprises.

                                                            1. 2

                                                              I’m not sure if I could agree with all the criticisms of GQL in this article. Out of all the features GQL supports, I would say the ability to solve overfetching and underfetching is most critical. When I use GQL at work, I either don’t use the Graph in GQL or at least enforce the depth of the graph to a minimum (max 2) if I can’t stop the schema from being designed that way.

                                                              Anything else, and we’ll just be reinventing GQL to some extent.