I’m not meaning to dissuade the SourceHut developer(s) from using GraphQL, but I am still on the fence about GraphQL, having used it at $employer for, oh, 4-ish months now. GraphQL is a shift in thinking, and there are sometimes several ways to provide data for a given business need.
In any system with models and DB tables, some tables/classes/types have more than one link to them (in an ERD). For example, an Article could have an Author, but it could also be in a MagazineIssue. When you have multiple pathways to reach a given entity, you (as the developer of a given use case, web page, whatever) have to decide which path you’re going to code in your query. Carrying on with the example schema: It could be currentUser -> articlesWritten; or it could be currentUser -> magazinesSubscribedTo -> issues -> articles. This kind of multiplicity of paths increases the complexity of the system, and makes the system more challenging to work with.
Authorization with GraphQL is quite non-trivial. Suppose you had: Authors organize with Folders, in which they put Articles; and then Readers read Articles. A simple schema would be to have this hierarchy: authors -> folders -> articles. Folders would have-many articles, and articles would belong-to folders. But if you have a query that needs to get articles for a reader to read and you stick with this simple schema, then you have to include folders in the query. But readers should not have access to or knowledge about the folder the author has put the article in. Or maybe the reader doesn’t care about who wrote what, and wants to get articles on a given topic or tag, like “politics”. Worse yet: a reader [UI] should not be able to query “upstream”, and get at article -> folder -> author -> author.email. “So don’t put upstream association links in your schema” you might say. But is it so simple? What if a pair of entities are not in a simple, obvious hierarchy where one is “higher” than the other? Say, a Person model with a friendsWith association.
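To make the “upstream link” worry concrete, here is a minimal sketch of such a schema in Python using graphene (the library choice and field names are mine, purely for illustration, not from any real schema): once Article carries folder and author fields, anything that can reach an Article can also walk up to author.email unless a resolver or field-level permission explicitly blocks it.

    import graphene

    class Author(graphene.ObjectType):
        name = graphene.String()
        email = graphene.String()   # sensitive field reachable "upstream"

    class Folder(graphene.ObjectType):
        name = graphene.String()

    class Article(graphene.ObjectType):
        title = graphene.String()
        tags = graphene.List(graphene.String)
        # Convenient for author-facing views, but these links also let a reader
        # query article -> folder / author -> email unless something forbids it.
        folder = graphene.Field(Folder)
        author = graphene.Field(Author)

    class Query(graphene.ObjectType):
        # Reader-facing entry point: articles by tag, no folders required.
        articles = graphene.List(Article, tag=graphene.String())

    schema = graphene.Schema(query=Query)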
I grant that there may be some correct way(s) to go about things to deal with issues like the above, but GraphQL doesn’t come with guardrails to prevent problematic schemas, and I think it’s not only possible, but somewhat likely for a team new to GraphQL to plant footguns and landmines in parts of their schema.
So, yeah. I’m still on the fence about this GraphQL thing.
Indeed. At GitHub we had a VP who was All About GraphQL – we GraphQL’d everything, over a period of a couple years. By the time I left, there were RFCs circulating about how we’d undo all the damage done :|
Could you describe some of the damage which had been done? What did the push for GraphQL break?
I’m curious about this as well.
Thanks for these comments. I would appreciate more specific feedback, if you have the time to provide some. You can use our equivalent of the GraphQL playground here:
https://git.sr.ht/graphql
Expand the text at the bottom to see the whole schema for git.sr.ht. This went through many design revisions, and I’m personally pretty satisfied with the results.
Also, specifically on the subject of authorization, the library we’re using provides good primitives for locking down access to specific paths under specific conditions, and combined with a re-application of some techniques we developed in Python, this is basically a non-problem for us. I’m still working on an improved approach to authentication, but none of the open questions have anything to do with GraphQL.
Could I ask about this part?
GraphQL does not solve many of the problems I would have hoped it would solve. It does not represent, in my opinion, the ultimate answer to the question of how we build a good web API.
Which additional problems would you like to see solved? In which direction might the ultimate answer lie?
I don’t think it solves pagination well. It should have first-class primitives for this. That’s the main issue.
Same here, except we’ve had lackluster success with GraphQL for 18-ish months instead of 4.
I could see it being a big step up for data models that are oriented primarily around a graph, but I’m having a really hard time seeing how sourcehut’s would work this way, outside of the git commits themselves. I’d be interested in reading more about the dissatisfaction with REST, because everything we’ve done so far with GraphQL could have been done better with REST along with something like JSON-schema.
For small projects, I’ve started just using json-rpc (what’s old is new again!) with a well-defined schema (with validation). A nice side effect is that the schema gets used as part of the documentation generation.
Saves lots of time noodling around with REST. I probably wouldn’t recommend it for large projects though, as it is a bit harder to version APIs (a bit more “client fragile”).
I’ve used GraphQL on a few projects, and I thought it was ok. It certainly has its own set of problems.
Caching and pagination are also issues I’ve seen with graphql, as well as many many failures around n+1 queries.
About the n+1 problem:
Disclaimer: I have only dabbled with graphql and never used it professionally.
Let’s compare this to a REST API (or what would be a better reference point?). In a REST API, you also have the n+1 problem if your entities refer to other entities.
It is just more common to solve this on the client rather than on the server – which is worse due to higher roundtrips. Alternatively, you provide a denormalized view in your API that provides all data for, let’s say a screen or page, on the server. That means that you have to change your server in lockstep with the client – and also worry about versioning. In the server code you can either use a generalized solution (much like with GraphQL) or a specialized version because it only has to work with this one query. In my opinion, a generalized version is also usually the preferred option but it is nice to have the option for hand-optimization.
Therefore I think that solving the n+1 problem in GraphQL requires some thought - but also in a REST API. There are good enough solutions. On one hand, since you have more information in your schema, you actually have some more options than in REST. On the other hand, REST is more constrained so you can work with more specialized solutions. Am I missing something important?
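For what it’s worth, the generalized server-side fix usually amounts to a dataloader pattern: collect the keys requested while resolving one query, then satisfy them with a single batched lookup. A library-free sketch of the idea (function names made up for illustration):

    def batch_load(ids, fetch_many):
        """Resolve many keys with one lookup instead of one lookup per key (the n+1 fix)."""
        unique = list(dict.fromkeys(ids))   # dedupe while keeping order
        rows = fetch_many(unique)           # e.g. one SELECT ... WHERE id IN (...), one round trip
        by_id = {row["id"]: row for row in rows}
        return [by_id.get(i) for i in ids]  # missing keys come back as None

    # Usage sketch: the resolver for Article.author queues author_ids during one
    # request, then a single batch_load(author_ids, fetch_authors) call replaces
    # N separate per-article author queries.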
Pagination is not really a problem, you just have to make a type specifically for pagination.
So for an Article you would have an ArticleConnection containing… yeah just use REST it’s way more sane.
One thing to consider is that “just have to make a type” relies on the fact that GraphQL fields can have parameters. This introduces a challenge for caching, since those parameters become part of your cache key. If you’re trying to take advantage of a normalizing graph cache on your client, the cache engine needs to be aware of the pagination primitives.
Not advocating for REST, since you have a similar problem there. However, REST makes some static decisions about what fields are included, and so the cache key is reasonably a URL instead of a GraphQL AST + variables map. If you decide you don’t want a normalizing cache, but do want a naive document cache à la REST, managed by the browser, you can’t get that unless you have named queries stored on the server. Then you can query for /query/someName?x=1&y=2 sort of thing.
I guess my point is that nothing with GraphQL is “just” anything. You’re discarding a lot of builtin stuff from the browser and have to recreate it yourself through cooperation of client JavaScript and specialized server code. Whether or not that’s worthwhile really depends on what you’re building.
When you have multiple pathways to reach a given entity, you (as the developer of a given use case, web page, whatever) have to decide which path you’re going to code in your query
Why is this a bad thing? Presumably you always fetch articles using the same ArticleResolver, this doesn’t take any extra effort to support on the backend. And the multiplicity of paths is the nature of graphs! I don’t see why it makes the front-end more complicated.
Authorization with GraphQL is quite non-trivial
The best advice I’ve seen for this is: don’t do authorization in GraphQL. You have some system which accepts and resolves GraphQL queries, it fetches data from other systems, those other systems should be doing the authorization. I haven’t had a chance to apply this advice in practice, I think that for some use-cases postgres row-level security would do wonders here, but it seems reasonable to me and it would solve most of your complaints. If your GraphQL resolver can’t even see author.email, then it’s no problem at all to include that link in your schema, it’ll only resolve if the user is allowed to see that email.
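To sketch what that looks like in Python (the Viewer type and db handle here are hypothetical, just for illustration): the resolver asks the data layer for a field on behalf of the current viewer, and the data layer, or Postgres row-level security underneath it, decides whether anything comes back.

    from dataclasses import dataclass
    from typing import Any, Optional

    @dataclass
    class Viewer:
        user_id: int

    def fetch_author_email(db: Any, viewer: Viewer, author_id: int) -> Optional[str]:
        # Data-layer rule: only the author may see their own email address.
        # The GraphQL resolver calls this without its own checks; unauthorized
        # viewers simply get None (or an error, depending on the chosen policy).
        # `db` is a hypothetical data-access handle, not a specific library.
        if viewer.user_id != author_id:
            return None
        row = db.fetch_one("SELECT email FROM authors WHERE id = :id", {"id": author_id})
        return row["email"] if row else None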
“just not resolve”: But how does that actually play out in practice?
Your query which requested 40 fields in a hierarchical structure gets 34 values back – and just has to deal with it. Your frontend might have TypeScript types corresponding to some models in your system. So then some places, which query under certain contexts, get 38 values back, other places get 34, and so on. So does your TypeScript now just have to have everything marked with a ? suffix (not required, null is permissible)? Suppose you have an Angular component which takes an Article type prop, or gets it from NgRX, or wherever. Now, your component cannot just populate its HTML template with all known fields of the Article type, like author name, article title, date published, etc. Sometimes that data will just not be there, “because auth”. So will it have to have *ngIf all over the place? Or just have {{ someField }} interpolations peppered throughout that just render as a blank space – sometimes? Should you have a different Angular component for each context in which this type appears?
And this is all just talking about a simple single defined GraphQL type and its fields. Things get even more fun when you have 3+ levels of hierarchy, and arrays of results. What if the model hierarchy is A -> B -> C, and your currentUser has access to the queried A, and the queried C, but not the queried B? Or, because of lack of authorization, it only has access to a subset of the array of values of a given field. So, your frontend now displays a partial set of results – without writing any error in your error log. Or maybe an auth error on an array field makes it return no records at all. Or maybe a null instead of []. So Engineer Alice is expecting to see 10 records, just like she saw on Engineer Bob’s screen when they were pairing yesterday. But today, when she goes to work in her branch, she only sees 7 – and cannot easily see why. “Well, your devs are supposed to know your (whole) schema”, we might say. That’s ideal, but in a team beyond a certain size, that can’t be assured (and, beyond a certain size after that, shouldn’t be expected).
“Okay, so don’t make auth problems silent, make them noisy” we might say. Yes, I agree. So how noisy should these be? Should a single auth problem, in one single node in a 7-level hierarchy of 82 fields in a GraphQL query cause the entire query to return non-200? Teams have to decide about that, and some teams won’t agree to that approach. For example, at $current_employer, our (non-unanimous) agreement at the moment is to write to error logs, dispatch to $third_party_error_service, but return null and return success (200) for the query. But, in so doing, we are still having some engineers come into the Slack channel(s) and reach out for help because they don’t understand why their GraphQL isn’t working any more, or why such and such page used to return whatever yesterday, but is returning something else today.
I’m just trying to point out that GraphQL isn’t problem-free. There are warts and challenges that need to be acknowledged and addressed before success can be had.
I put my comment out there knowing it was wrong and hoping someone would come along to tell me what I was wrong about, but I don’t think this particular criticism holds water.
You’re pointing out that if the frontend tries to access data it may or may not be authorized to see it… may or may not see it. The ? is essential complexity that is no fault of GraphQL, REST will have the same problem.
I think you’re arguing that:
In GraphQL all of your data fetches happen at once, so it’s not possible to 403 just the missing pieces, nor feasible to 403 the whole request. We agree here!
It’s not feasible to expect every engineer to know how the schema works. I agree, and think that’s exactly why types were invented. If User.email is nullable then engineers using User should expect that there will sometimes not be an email. Going further, you might want to make some types self-documenting. If it’s expected they’ll often fail, due to lack of authorization, you might return a union type: ViewableUser | CloakedUser.
Silently not returning data you can’t see is a recipe for confusion. That’s sometimes true, I agree! However, say you’re writing a task manager and the query asks for a list of all tasks. There, it’s pretty intuitive that the query would only return the tasks you’re allowed to see. I would expect that in most situations there’s an intuitive solution, authorization is a concept that we’ve all spent a lot of time getting used to.
There’s another technique you might use, which is to return the errors inline with the query. Return both the data and also the errors. If some field fails to resolve you might leave it null and also bundle an error in with the response explaining why the field is null.
Which of these techniques you go for will depend on the situation, and probably there are some situations where none of them are satisfying, but I don’t think it’s reasonable to expect any technology to be problem-free. You’re holding GraphQL up to a very high standard here!
Context: After dabbling, I think that GraphQL is usually better than a REST API – and I want to find out if I am wrong. I am fine if it doesn’t solve all problems but it shouldn’t make things worse.
Back to your comment: I think it is normal that your data schema has to take authorization constraints into consideration. Agreed, that does not always make it easy.
If you cannot always return data, you have to make it optional. Most likely, you don’t have to do this field by field but can put them into their own sub-object.
This is also what you would need to do if you used a REST API with a schema. Or not?
Just an idea: If you want to make authentication restrictions more obvious, maybe you can use union types instead of optional fields? Similar to Either/Try in functional languages or Result in Rust, you could either provide the result or the error that prevented it. Or instead of an error, a restricted view… Then clients having trouble see the reason for the trouble directly in their result, but the schema gets more complicated.
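Roughly, in plain Python typing (names invented for the example), the consumer-side shape would be: instead of a user whose fields may silently be None, the caller gets either the full view or an explicit “cloaked” marker and has to handle both.

    from dataclasses import dataclass
    from typing import Union

    @dataclass
    class ViewableUser:
        name: str
        email: str

    @dataclass
    class CloakedUser:
        reason: str   # e.g. "not authorized"

    UserResult = Union[ViewableUser, CloakedUser]

    def render_user(user: UserResult) -> str:
        # The type forces callers to decide what a restricted view looks like,
        # instead of sprinkling None checks over every field.
        if isinstance(user, ViewableUser):
            return f"{user.name} <{user.email}>"
        return f"(hidden: {user.reason})"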
Frankly, I would have rathered that auth errors were noisy and fatal, making developers immediately aware of issues before things get to production. The counterargument from my team, though, was that they did not want to provide information to malicious actors about unauthorized things. So the decision was to make auth errors return 200s, and poke null holes in the payload fields, and/or snip subtrees out of the payload.
Context: After dabbling, I think that GraphQL is usually better than a REST API – and I want to find out if I am wrong. I am fine if it doesn’t solve all problems but it shouldn’t make things worse.
Indeed, it’s much the same for me. I don’t have any big personal problem with GraphQL. I just try to assess things objectively, and avoid getting pulled along with hype waves without reason. I just haven’t had an overall positive experience yet with GraphQL. It’s been plus-and-minus.
The counterargument from my team, though, was that they did not want to provide information to malicious actors about unauthorized things.
It is hard to judge without your schema but that sounds slightly weird. An attacker potentially gets more information from which fields are missing than from a generic auth denied. But even if that’s not true, it sounds like security by obscurity to me. [But yes, I am frequently wrong, and maybe there is a good reason for this in your case ;)]
Your point stands, but the nuance here is more like: if someone doesn’t have access to X and you tell them it’s missing rather than forbidden, they can’t infer its existence by trial and error, seeing what gives an error vs. what gives absence.
Can you elaborate how you would solve the “graph” thing better with other API designs? E.g. in a REST design you would have the same potential pathways that you could use for querying? (Just in separate requests?)
I don’t know if it’s better, but what I do is this: I actually am pretty far from a REST purist, insofar as I do not strictly adhere to a 1:1 mapping between the object hierarchy and the REST endpoints. Instead, I prefer steering towards having endpoints 1:1 with frontend pages (for GETs, at least). So, where there are multiple pathways, my endpoints give only the information needed for the page. For the previous example, a Reader needing Articles would only get an array of Articles back, and would not have any Folders or Authors come along for the ride in the returned payload. If the page needs to display, say, an Author name, then it would come in the [same] payload, too, except as an extra field right on the Articles in the payload.
I like the simplicity and straightforwardness of “1:1 with frontend pages”, even though it causes criss-crossing and mixing of models/types in a given payload. I leave the entity relationship strictness to the model/DB level (has-many, belongs-to, etc.).
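As a rough Flask sketch of that style (route and field names are made up for illustration), the reader page gets one denormalized payload with exactly what it renders, and folders never appear in it:

    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/api/pages/reader-articles")
    def reader_articles_page():
        # In a real app these rows would come from the database, already filtered
        # to what the current reader may see; author_name is denormalized onto
        # each article instead of nesting an Author object.
        articles = [
            {"title": "On politics", "author_name": "A. Writer", "published": "2020-06-01"},
            {"title": "On tags", "author_name": "B. Writer", "published": "2020-06-02"},
        ]
        return jsonify({"articles": articles})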
Thank you for elaborating. I have used that pattern and it worked fairly well for me, too.
You have to keep client/server in sync, though. I really like the flexibility of graphql here but I haven’t used it on a real life project.
What has been the problem with Python/Flask/SQLA?
Python: the size of the codebase and number of moving parts has reached a point where the lack of static typing has become the main source of programmer errors in the code. There are type annotations now, but they don’t work very well IMO, are not used by most of our dependencies, and would be almost as much work to retrofit onto our codebase as switching to a type-safe language would be. The performance of the Python VM is also noticeably bad. We could try PyPy, but again… we’re investing a lot of effort just to stick to a language which has repeatedly proven itself poorly suited to our problem. The asyncio ecosystem helps but it’s still in its infancy and we’d have to rewrite almost everything to take advantage of it. And again, if we’re going to rewrite it… might as well re-evaluate our other choices while we’re at it.
Flask: it’s pretty decent, and not the main source of our grief (though it is somewhat annoying). My main feedback for Flask would be that it tries to do just a little bit too much. I wish it was a little bit more toolkit-oriented in its design and a more faithful expression of HTTP as a library.
SQLAlchemy: this is now my least favorite dependency in our entire stack. It’s… so bad. I just want to write SQL queries now. The database is the primary bottleneck in our application, and hand-optimizing our SQL queries is always the best route to performance improvements. Some basic stuff is possible with SQLAlchemy, simple shit like being smart about your joins and indices, but taking advantage of PostgreSQL features is a pain. It’s a bad ORM - I’m constantly fighting with it to just do the shit I want it to and stop dicking around - and it’s a bad database abstraction layer - it’s too far removed from Postgres to get anything more than the basics done without a significant amount of grief and misery. Alembic is also constantly annoying. Many of the important improvements I want to do for performance and reliability are blocked on ditching these two dependencies.
Another problem child that I want to move away from is Celery. It just isn’t flexible enough to handle most of the things I want to do, and we have to use it for anything which needs to be done asynchronously from the main request handling flow. In Go it’s a lot easier to deal with such things. Go also allows me to get a bit closer to the underlying system, with direct access to syscalls and such*, which is something that I’ve desired on a few occasions.
For the record, the new system is not without its flaws and trade-offs. Go is not a perfect tool, nor GraphQL. But, they fit better into the design I want. This was almost a year of research in the making. The Python codebase has served us well, and will continue to be useful for some time to come, in that it (1) helped us understand the scope necessary to accomplish our goals, and (2) provided a usable platform quickly. Nothing quite beats Python for quickly and easily building a working prototype, and it generally does what you tell it to, in very few lines of code. But, its weaknesses have become more and more apparent over time.
* Almost. The runtime still gets on my nerves all the time and is still frustratingly limiting in this respect.
Thanks for responding. I think static typing in Python works really well once configured so I’m surprised to hear you say that. I think it’s better than the static typing in most other languages because generics are decent and the inference is pretty reasonable. For example it seems better thought out than Java, C and (in my limited experience) Go. My rough feeling is that 75% of the Python ecosystem either has type annotations or has type stubs in typeshed. Where something particularly important is untyped, I tend to just wrap it and give it an explicit annotation (this is fairly rare). I’ve written some tips on getting mypy working well on bigger projects.
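Concretely, the wrapping trick looks something like this (the legacy function below stands in for an untyped dependency; everything here is just an illustration):

    from typing import List

    def _legacy_fetch(feed_url):   # imagine this lives in an untyped third-party library
        return [{"title": "a"}, {"title": "b"}]

    def fetch_article_titles(feed_url: str) -> List[str]:
        # Thin typed wrapper: the untyped call is confined to one place, and
        # mypy checks every caller against this explicit signature instead.
        return [str(item["title"]) for item in _legacy_fetch(feed_url)]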
I don’t think you have the right intuition that asyncio would help you if your problem is speed. I’m pretty convinced that asyncio is in fact slower than normal Python in most cases (and am currently writing another blogpost about that - uWSGI is for sure the fastest and most robust way to run a Python webservice). Asyncio stuff tends to fail in weird ways under load. I also think asyncio is a big problem for correctness - it actually seems quite hard to get asyncio programs right and there are a lot of footguns around.
Re: SQLAlchemy - I’m also very surprised. I think SQLAlchemy is a good ORM and I’ve used postgres specific features (arrays, json, user defined functions, etc) from it a great deal. If you want to write SQL-level code there is nothing stopping you from using the “core” layer rather than the “ORM” layer. There’s also nothing stopping you using SQL strings with parameterisation, ie "select col_a from table where col_b = :something" - I do that sometimes too. I have to say I have never had trouble with hand optimising a SQL query in SQLA - ever - because it gives you direct control over the query (this is even true at the ORM level). One problem I have run into is where people decide to use SQLA ORM objects as their domain objects and… that doesn’t end happily.
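For anyone who hasn’t seen it, the escape hatch I mean looks roughly like this (table and column names invented): text() gives you hand-written SQL with bound parameters, outside the ORM entirely.

    from sqlalchemy import create_engine, text

    engine = create_engine("postgresql:///example")  # placeholder DSN

    def articles_by_tag(tag):
        # Hand-written, Postgres-specific SQL with a bound parameter; SQLAlchemy
        # only handles the connection and parameter binding here.
        query = text("SELECT id, title FROM articles WHERE :tag = ANY(tags) ORDER BY published DESC")
        with engine.connect() as conn:
            return conn.execute(query, {"tag": tag}).fetchall()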
Celery however is something that I do think is quite limited. It’s really just a task queue. I am not sure that firing off background tasks as goroutines is a full replacement though, as you typically need to handle errors, retry, record what happened, etc. I think even if you were using Go, every serious system ends up with a messaging subsystem inside it - at least for background tasks. People do not usually send emails from their webserving processes. Perhaps the libraries for this in Go land are better but in Python I don’t think there is a library that gets this kind of thing wholly right. I am working on my own thing but it’s too early to recommend it to anyone (missive). I want to work on it more but childcare responsibilities are getting in the way! :)
Best of luck in your rewrite/rework. I have not been impressed with GraphQL so far but I haven’t used the library you’re planning to use. My problems with GraphQL so far are that a) it isn’t amenable to many of the optimisations I want to do with it b) neither schema first nor code first really work that well and c) its query language is much more limited than it looks - much less expressive than I would like. You may not find that the grass is greener!
I don’t think you have the right intuition that asyncio would help you if your problem is speed.
I don’t want asyncio for speed, I want it for a better organizational model of handling the various needs of the application concurrently. With Flask, it’s request in, request out, and that’s all you get. I would hope that asyncio would improve the ability to handle long-running requests while still servicing fast requests, and also somewhat mitigate the need for Celery. But still, I’ve more or less resigned from Python at this point, so it’s a moot point.
I am not sure that firing off background tasks as goroutines is a full replacement though as you typically need to handle errors, retry, record what happened, etc.
Agreed. This is not completely thought-out yet, and I don’t expect the solution to be as straightforward as fire-and-forget.
My problems with GraphQL so far are that a) it isn’t amenable to many of the optimisations I want to do with it b) neither schema first nor code first really work that well and c) it’s query language is much more limited than it looks - much less expressive than I would like.
I have encountered and evaluated all of the same problems, and still decided to use GraphQL. I am satisfied with the solutions to (a) and (b) presented by the library I chose, and I feel comfortable building a good API within the constraints of (c). Cheers!
So do you plan to keep the web UI in Python using Flask, and have it talk to a Go-based GraphQL API server? Or do you plan to eventually rewrite the web UI in Go as well? If the latter, is there a particular Go web framework or set of libraries that you like, or just the standard library?
To be determined. The problems of Python and Flask become much less severe if it’s a frontend for GraphQL, and it will be less work to adapt them as such. I intend to conduct more research to see if this path is wise, and also probably do an experiment with a new Golang-based implementation. I am not sure how that would look, yet, either.
It’s also possible that both may happen, that we do a quick overhaul of the Python code to talk to GraphQL instead of SQL, and then over time do another incremental rewrite into another language.
I’m curious about why you consider that Flask does “a little bit too much”. It’s a very lightweight framework, and the only “batteries included” thing I can think of is the usage of Jinja for template rendering. But if I’m not wrong, sourcehut uses it a lot so I don’t think this is what annoys you.
Regarding SQLAlchemy, I totally agree with you. It’s a bad database abstraction layer. When you try to make simple queries it becomes cumbersome because of SQLAlchemy’s supposed low level abstractions. But when you want to make a fine-grained query it’s also a real pain and you end up writing raw SQL because it’s easier. In some cases you can embed some raw SQL fragment inside the ORM query, but it is often not the case (for example, here is a crappy piece of code I’m partially responsible for). Not having a decent framework-agnostic ORM is the only thing that makes me miss Django :(
Regarding Flask, I recently saw Daniel Stone give a talk wherein he reflected on the success of wlroots compared to the relative failure of libweston, and chalked it up to the difference between a toolkit and a midlayer, where wlroots is the former. Flask is a midlayer. It does its thing, and provides you a little place to nestle your application into. But, if you want to change any of its behavior - routing, session storage, and so on - you’re plugging into the rails it has laid down for you. A toolkit approach would instead have the programmer always be in control, reaching for the tools they need - routing, templating, session management, and so on - as they need them.
I’ve personally found falcon a bit nicer to work with than flask, as an api/component.
That said, as a daily user for some mid-sized codebases (some 56k odd lines of code), I very much agree with what you said about python and sqlalchemy.
I find that linked piece of code perplexing because converting that from string-concat-based dynamic SQL into SQLA core looks straightforward: pull out the subqueries, turn them into python level variables and then join it all up in a single big query at the end. That would also save you from having a switch for sqlite in the middle of it - SQLA core would handle that.
SQLAlchemy: this is now my least favorite dependency in our entire stack. It’s… so bad
That’s also the only thing I remember about it from when I used it years ago. Maybe it’s something everyone has to go through once to figure out that the extra layer might look tasty, but in the end it only gives you stomach aches.
Yeah, I’d be very interested to hear more about that too. Not that I disagree, but I think his article was light on details. What were the things that “soured” his view of Python for larger projects, and why was he “unsatisfied with the results” of REST?
I found REST difficult to build a consistent representation of our services with, and it does a poor job of representing the relationship between resources. After all, GraphQL describes a graph, but REST describes a tree. GraphQL also benefits a lot from static typing and an explicit schema defined in advance.
Also, our new codebase for GraphQL utilizes the database more efficiently, which is the main bottleneck in the previous implementation. We could apply similar techniques, but it would require lots of refactoring and SQLAlchemy only ever gets in the way.
I’ve been using Flask and Gunicorn. I basically do native dev before porting it to a web app. My native apps are heavily decomposed into functions. One thing that’s weird is they break when I use them in the web setup. The functions will be defined before the “@app” route decorators or whatever it is, like in a native app. Then, Gunicorn or Flask tells me the function is undefined or doesn’t exist.
I don’t know why that happens. It made me un-decompose those apps to just dump all the code in the main function. Also, I try to do everything I can outside the web app with it just using a database or something. My Flask apps have stayed tiny and working but probably nearing the limit on that.
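For reference, a minimal layout that normally works (names here are made up, and without the actual code this is only a guess at the usual structure): module-level helpers defined alongside the routes are visible from view functions under both the Flask dev server and Gunicorn.

    from flask import Flask, jsonify

    app = Flask(__name__)

    def summarize(numbers):
        # Ordinary module-level helper, reused from the "native" version of the code.
        return {"count": len(numbers), "total": sum(numbers)}

    @app.route("/summary")
    def summary_view():
        # Helpers defined at module level are in scope here; if Gunicorn reports
        # them as undefined, it is worth checking which module Gunicorn actually
        # imported (the module:app target) and whether the helper lives there.
        return jsonify(summarize([1, 2, 3]))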
Some important points about GraphQL not mentioned in this article :
GraphQL is vulnerable to DoS attacks. Not only do you have to do parsing, but if you want to throttle queries, you have to evaluate a query’s performance impact on the database dynamically, so you can limit how much malicious queries impact your server (see the depth-limiting sketch after this list). None of the open source libraries do this (as far as I am aware).
As mentioned in the comments, if you want pagination your types will be polluted by wrapper types.
Authorization is a pain and most libraries also don’t support it per-object.
GraphQL requests don’t cache and are impossible to optimize.
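As a very rough sketch of the kind of pre-execution throttling meant above, one can parse the incoming query with the graphql-core parser and reject anything nested too deeply; real cost estimation would also need to weigh list fields, arguments, and fragment spreads (this is illustrative, not taken from any particular library).

    from graphql import parse

    def query_depth(node, depth=0):
        # Deepest nesting of selection sets reachable from this AST node.
        # Fragment spreads are not expanded here, so this undercounts them.
        selection_set = getattr(node, "selection_set", None)
        if selection_set is None:
            return depth
        return max(
            (query_depth(sel, depth + 1) for sel in selection_set.selections),
            default=depth,
        )

    def reject_if_too_deep(query_string, max_depth=10):
        document = parse(query_string)
        deepest = max(query_depth(definition) for definition in document.definitions)
        if deepest > max_depth:
            raise ValueError(f"query depth {deepest} exceeds limit {max_depth}")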
All of these fall under the problem I mentioned in the article: “The quality of server implementations has been rather poor on each of my research attempts, especially outside of JavaScript implementations.” However, they’re also all solved by the new server-side implementation I mentioned using, gqlgen. It has good tools for estimating complexity and query introspection, and tags you can decorate each path with to have fine-grained access controls. It can also cache requests by computing their hash, which can be done either client-side or server-side. Some novel optimizations are possible with GraphQL, and would have been more difficult with Python+Flask. Consider reading through our git.sr.ht GraphQL implementation and our shared GraphQL code to see some of this in action.
These are all problems that aren’t inherent to GraphQL, but were rather symptoms of the poor quality of server-side implementations which have been available for most of GraphQL’s lifetime (and were the reason why, on multiple occasions in the past, I discarded GraphQL as an untenable solution to our problems).
It doesn’t provide anything special for pagination, but I came up with a design which I am reasonably happy with. This is one of the areas that I think GraphQL really ought to have solved better, and I expect to be done better by The Next Thing.
I’ve always thought that SPARQL is way better than GraphQL at operating with graphs (simpler, way more composable). However, it’s worse on mappings to a relational database and in doing everything else that GraphQL can do (RPC)
This was a good read for me as I’m all-in on GraphQL. My stack is heavy on JavaScript/TypeScript and, as assumed in the article, I have excellent experiences with it.
Reading the comments here concerned me though. While my projects are small right now, the possibility of being DDoS’d never occurred to me, among other things. I’ve got a lot of research to do in the AM.
Originally I chose to use REST for the API, because REST is boring and well-understood. Boring, well-understood approaches are the bread and butter of SourceHut. However, I have been very unsatisfied with the results, and have been unwilling to take this design forward into the beta.
Personally I would be interested in hearing more about this. I don’t recall seeing why Drew feels a REST approach is unsatisfactory.
There is a recurring theme that I pick up: Python is a great language to start a project in. But then, later, it is decided to rewrite it in something with more static typing and better performance.
I wonder if there are exceptions where people either just stay happy with Python even in big projects, or even migrate from a statically typed language to Python.
Neither the absence nor the presence of these exceptions would really prove anything, but it would hint at what teams in larger projects think in general about using Python. I’d also be interested in stories like that for other less statically typed languages (I think it is a gradient).
I kinda think all codebases feel like they’re rotting after about 5 years, regardless of language. C and C++ codebases are like that too …
edit: Just read over the slides, it was a good read. Interestingly they talk about the performance of goroutines vs. async with Twisted. I’m not a Go user but I definitely like the blocking / goroutine style vs async. It sounds like they were experts in Twisted, which is relatively rare. (But yes using the boring tool you know is good)
Different tools for different applications… every language is good for something, and every language sucks for something, but it’s hard to tell that ahead of time :)
Thank you for the example, it was an interesting read. Especially the fact that Python with PyPy is more memory efficient than Go is something I wouldn’t have considered.
It seemed to be a small code base, though, since they reimplemented it in four days.
Some of the worst codebases I have ever encountered used Twisted. I shudder in horror at the memory. oof.
To this day I have a strong personal aversion (undoubtedly an unreasonable one!) to it.
I’m not meaning to dissuade the SourceHut developer(s) from using GraphQL, but I am still on the fence about GraphQL, having used it at
$employer
for, oh, 4-ish months now. GraphQL is a shift in thinking, and there are sometimes several ways to provide data for a given business need.In any system with models and DB tables, some tables/classes/types have more than one link to them (in an ERD). For example, an Article could have an Author, but it could also be in a MagazineIssue. When you have multiple pathways to reach a given entity, you (as the developer of a given use case, web page, whatever) have to decide which path you’re going to code in your query. Carrying on with the example schema: It could be
currentUser
->articlesWritten
; or it could becurrentUser
->magazinesSubscribedTo
->issues
->articles
. This kind of multiplicity of paths increases the complexity of the system, and makes the system more challenging to work with.Authorization with GraphQL is quite non-trivial. Suppose you had: Authors organize with Folders, in which they put Articles; and then Readers read Articles. A simple schema would be to have this heirarchy: authors -> folders -> articles. Folders would have-many articles, and articles would belong-to folders. But if you have a query that needs to get articles for a reader to read, if you stick with this simple schema, then you have to include folders in the query. But readers should not have access to or knowledge about the folder the author has put the article in. Or maybe the reader doesn’t care about who wrote what, and wants to get articles on a given topic or tag, like “politics”. Worse yet: a reader [UI] should not be able to query “upstream”, and get at
article
->folder
->author
->author.email
. “So don’t put upstream association links in your schema” you might say. But is it so simple? What if a pair of entities are not in a simple, obvious hierarchy where one is “higher” than the other. Say, a Person model with a friendsWith association.I grant that there may be some correct way(s) to go about things to deal with issues like the above, but GraphQL doesn’t come with guardrails to prevent problematic schemas, and I think it’s not only possible, but somewhat likely for a team new to GraphQL to plant footguns and landmines in parts of their schema.
So, yeah. I’m still on the fence about this GraphQL thing.
Indeed. At GitHub we had a VP who was All About GraphQL – we GraphQL’d everything, over a period of a couple years. By the time I left, there were RFCs circulating about how we’d undo all the damage done :|
Could you describe some of the damage which had been done? What did the push for GraphQL break?
I’m curious about this as well.
Thanks for these comments. I would appreciate more specific feedback, if you have the time to provide some. You can use our equivalent of the GraphQL playground here:
https://git.sr.ht/graphql
Expand the text at the bottom to see the whole schema for git.sr.ht. This went through many design revisions, and I’m personally pretty satisfied with the results.
Also, specifically on the subject of authorization, the library we’re using provides good primitives for locking down access to specific paths under specific conditions, and combined with a re-application of some techniques we developed in Python, this is basically a non-problem for us. I’m still working on an improved approach to authentication, but none of the open questions have anything to do with GraphQL.
Could I ask about this part?
Which additional problems would you like to see solved? In which direction might the ultimate answer lie?
I don’t think it solves pagination well. It should have first-class primitives for this. That’s the main issue.
Same here, except we’ve had lackluster success with GraphQL for 18-ish months instead of 4.
I could see it being a big step up for data models that are oriented primarily around a graph, but I’m having a really hard time seeing how sourcehut’s would work this way, outside of the git commits themselves. I’d be interested in reading more about the dissatisfaction with REST, because everything we’ve done so far with GraphQL could have been done better with REST along with something like JSON-schema.
For small projects, I’ve started just using json-rpc (what’s old is new again!) with a well defined schema (with validation). Nice side effect is the schema gets used as part of the documentation generation.
Saves lots of time noodling around with REST. I probably wouldn’t recommend it for large projects though, as it is a bit harder to version apis (a bit more “client fragile”).
I’ve used GraphQL on a few projects, and I thought it was ok. It certainly has its own set of problems.
Caching and pagination are also issues I’ve seen with graphql, as well as many many failures around n+1 queries.
About the
n+1
problem:Disclaimer: I have only dabbled with graphql and never used it professionally.
Let’s compare this to a REST API (or what would be a better reference point?). In a REST API, you also have the
n+1
problem if your entities refer to other entities.It is just more common to solve this on the client rather than on the server – which is worse due to higher roundtrips. Alternatively, you provide a denormalized view in your API that provides all data for, let’s say a screen or page, on the server. That means that you have to change your server in lockstep with the client – and also worry about versioning. In the server code you can either use a generalized solution (much like with GraphQL) or a specialized version because it only has to work with this one query. In my opinion, a generalized version is also usually the preferred option but it is nice to have the option for hand-optimization.
Therefore I think that solving the
n+1
problem in GraphQL requires some thought - but also in a REST API. There are good enough solutions. On one hand, since you have more information in your schema, you actually have some more options than in REST. On the other hand, REST is more constrained so you can work with more specialized solutions.Am I missing something important?
Pagination is not really a problem, you just have to make a type specifically for pagination.
So for an Article you would have an ArticleConnection containing… yeah just use REST it’s way more sane.
One thing to consider is that “just have to make a type” relies on the fact that GraphQL fields can have parameters. This introduces a challenge for caching, since those parameters become part of your cache key. If you’re trying to take advantage of a normalizing graph cache on your client, the cache engine needs to be aware of the pagination primitives.
Not advocating for REST, since you have a similar problem there. However, REST makes some static decisions about what fields are included and so the cache key is reasonably a URL instead of a GraphQL AST + variables map. If you decide you don’t want a normalizing cache, but do want a naive document cache ah la REST, managed by the browser, you can’t get that unless you have named queries stored on the server. Then you can query for /query/someName?x=1&y=2 sort of thing.
I guess my point is that nothing with GraphQL is “just” anything. You’re discarding a lot of builtin stuff from the browser and have to recreate it yourself through cooperation of client JavaScript and specialized server code. Whether or not that’s worthwhile really depends on what you’re building.
Why is this a bad thing? Presumably you always fetch articles using the same ArticleResolver, this doesn’t take any extra effort to support on the backend. And the multiplicity of paths is the nature of graphs! I don’t see why it makes the front-end more complicated.
The best advice I’ve seen for this is: don’t do authorization in GraphQL. You have some system which accepts and resolves GraphQL queries, it fetches data from other systems, those other systems should be doing the authorization. I haven’t had a chance to apply this advice in practice, I think that for some use-cases postgres row-level security would do wonders here, but it seems reasonable to me and it would solve most of your complaints. If your GraphQL resolve can’t even see
author.email
, then it’s no problem at all to include that link in your schema, it’ll only resolve if the user is allowed to see that email.“just not resolve”: But how does that actually play out in practice?
Your query which requested 40 fields in a hierarchical structure gets 34 values back – and just has to deal with it. Your frontend might have Typescript types corresponding to some models in your system. So then some places, which query under certain contexts, get 38 values back, other places get 34, and so on. So does your Typescript now just have to have everything marked with a
?
suffix (not required,null
is permissible)? Suppose you have an Angular component which takes an Article type prop, or gets it from NgRX, or wherever. Now, your component cannot just populate its HTML template with all known fields of the Article type, like author name, article title, date published, etc. Sometimes that data will just not be there, “because auth”. So will it have to have*ngIf
all over the place? Or just have{{ someField }}
interpolations peppered throughout that just render as a blank space – sometimes? Should you have a different Angular component for each context that this type appears?And this is all just talking about a simple single defined GraphQL type and its fields. Things get even more fun when you have 3+ levels of hierarchy, and arrays of results. What if the model hierarchy is A -> B -> C, and your
currentUser
has access to the queried A, and the queried C, but not the queried B? Or, because of lack of authorization, it only has access to a subset of the array of values of a given field. So, your frontend now displays a partial set of results – without writing any error in your error log. Or maybe an auth error on an array field makes it return no records at all. Or maybe anull
instead of[]
. So Engineer Alice is expecting to see 10 records, just like she saw on Engineer Bob’s screen when they were pairing yesterday. But today, when she goes to work in her branch, she only sees 7 – and cannot easily see why. “Well, your devs are supposed to know your (whole) schema”, we might say. That’s ideal, but in a team beyond a certain size, that can’t be assured (and, beyond a certain size after that, shouldn’t be expected).“Okay, so don’t make auth problems silent, make them noisy” we might say. Yes, I agree. So how noisy should these be? Should a single auth problem, in one single node in a 7-level hierarchy of 82 fields in a GraphQL query cause the entire query to return non-200? Teams have to decide about that, and some teams won’t agree to that approach. For example, at
$current_employer
, our (non-unanimous) agreement at the moment is to write to error logs, dispatch to$third_party_error_service
, but returnnull
and return success (200) for the query. But, in so doing, we are still having some engineers come into the Slack channel(s) and reach out for help because they don’t understand why their GraphQL isn’t working any more, or why such and such page used to return whatever yesterday, but is returning something else today.I’m just trying to point out that GraphQL isn’t problem-free. There are warts and challenges that need to be acknowledged and addressed before success can be had.
I put my comment out there knowing it was wrong and hoping someone would come along to tell me what I was wrong about, but I don’t think this particular criticism holds water.
You’re pointing out that if the frontend tries to access data it may or not be authorized to see it… may or not see it. The ? is essential complexity that is no fault of GraphQL, REST will have the same problem.
I think you’re arguing that:
User.email
is nullable then engineers usingUser
should expect that there will sometimes not be an email. Going further, you might want to make some types self-documenting. If it’s expected they’ll often fail, due to lack of authorization, you might return a union type:ViewableUser | CloakedUser
.There’s another technique you might use, which is to return the errors inline with the query. Return both the data and also the errors. If some field fails to resolve you might leave it null and also bundle an error in with the response explaining why the field is null.
Which of these techniques you go for will depend on the situation, and probably there are some situations where none of them are satisfying, but I don’t think it’s reasonable to expect any technology to be problem-free. You’re holding GraphQL up to a very high standard here!
Context: After dablling, I think that GraphQL is usually better than a REST API – and I want to find out if I am wrong. I am fine if it doesn’t solve all problems but it shouldn’t make things worse.
Back to your comment: I think, it is normal that your data schema has to take authorization constraints into consideration. Agreed, that does not make it always easy.
If you cannot always return data, you have to make it optional. Most likely, you don’t have to do this field by field but put them into their own sub object.
This is also what you would need to do if you used a REST API with a schema. Or not?
Just an idea: If you want to make authentication restrictions more obvious, maybe you can use union types instead of optional fields? Similar to
Either
/Try
in functional languages orResult
Rust, you could either provide the result or the error that prevented it. Or instead of an error, a restricted view… Then clients having trouble see the reason for the trouble directly in their result but the schema gets more complicated.Frankly, I would have rathered that auth errors were noisy and fatal, making developers immediately aware of issues before things get to production. The counterargument from my team, though, was that they did not want to provide information to malicious actors about unauthorized things. So the decision was to make auth errors return 200s, and poke
null
holes in the payload fields, and/or snip subtrees out of the payload.Indeed, it’s much the same for me. I don’t have any big personal problem with GraphQL. I just try to assess things objectively, and avoid getting pulled along with hype waves without reason. I just haven’t had an overall positive experience yet with GraphQL. It’s been plus-and-minus.
Thank you for your thoughtful replies :)
It’s is hard to judge without your schema but that sounds slightly weird. An attacker gets potentially more information by what fields are missing than a generic auth denied. But even if that’s not true, sounds like security by obscurity to me. [But yes, I am frequently wrong, and maybe there is a good reason for this in your case ;)]
Your point stands, but the nuance here is that it’s more like: If someone doesn’t have access to X, if you tell them it’s missing, they can’t infer existence by trial-and-erroring and seeing what gives error vs. what gives absence.
Can you elaborate how you would solve the “graph” thing better with other API designs? E.g. in a REST design you would have the same potential path ways that you could use for querying? (Just in separate requests?)
I don’t know if it’s better, but what I do is this: I actually am pretty far from a REST purist, insofar as I do not strictly adhere to a 1:1 mapping between the object hierarchy and the REST endpoints. Instead, I prefer steering towards having endpoints 1:1 with frontend pages (for GETs, at least). So, where there are multiple pathways, my endpoints give only the information needed for the page. For the previous example, a Reader needing Articles would only get an array of Articles back, and would not have any Folders or Authors come along for the ride in the returned payload. If the page needs to display, say, an Author name, then it would come in the [same] payload, too, except as an extra field right on the Articles in the payload.
I like the simplicity and straightforwardness of “1:1 with frontend pages”, even though it causes criss-crossing and mixing of models/types in a given payload. I leave the entity relationship strictness to the model/DB level (has-many, belongs-to, etc.).
Thank you for elaborating. I have used that pattern and it worked fairly well for me, too.
You have to keep client/server in sync, though. I really like the flexibility of graphql here but I haven’t used it on a real life project.
What has been the problem with Python/Flask/SQLA?
Python: the size of the codebase and number of moving parts has reached a point where the lack of static typing has become the main source of programmer errors in the code. There are type annotations now, but they don’t work very well IMO, are not used by most of our dependencies, and would be almost as much to retrofit onto our codebase as switching to a type-safe language would be. The performance of the Python VM is also noticably bad. We could try PyPy, but again… we’re investing a lot of effort just to stick to a language which has repeatedly proven itself poorly suited to our problem. The asyncio ecosystem helps but it’s still in its infancy and we’d have to rewrite almost everything to take advantage of it. And again, if we’re going to rewrite it… might as well re-evaluate our other choices while we’re at it.
Flask: it’s pretty decent, and not the main source of our grief (though it is somewhat annoying). My main feedback for Flask would be that it tries to do just a little bit too much. I wish it was a little bit more toolkit-oriented in its design and a more faithful expression of HTTP as a library.
SQLAlchemy: this is now my least favorite dependency in our entire stack. It’s… so bad. I just want to write SQL queries now. The database is the primary bottleneck in our application, and hand-optimizing our SQL queries is always the best route to performance improvements. Some basic stuff is possible with SQLAlchemy, simple shit like being smart about your joins and indicies, but taking advantage of PostgreSQL features is a pain. It’s a bad ORM - I’m constantly fighting with it to just do the shit I want it to and stop dicking around - and it’s a bad database abstraction layer - it’s too far removed from Postgres to get anything more than the basics done without a significant amount of grief and misery. Alembic is also constantly annoying. Many of the important improvements I want to do for performance and reliability are blocked by ditching these two dependencies.
Another problem child that I want to move away from is Celery. It just isn’t flexible enough to handle most of the things I want to do, and we have to use it for anything which needs to be done asyncronously from the main request handling flow. In Go it’s a lot easier to deal with such things. Go also allows me to get a bit closer to the underlying system, with direct access to syscalls and such*, which is something that I’ve desired on a few occasions.
For the record, the new system is not without its flaws and trade-offs. Go is not a perfect tool, nor GraphQL. But, they fit better into the design I want. This was almost a year of research in the making. The Python codebase has served us well, and will continue to be useful for some time to come, in that it (1) helped us understand the scope necessary to accomplish our goals, and (2) provided a usable platform quickly. Nothing quite beats Python for quickly and easily building a working prototype, and it generally does what you tell it to, in very few lines of code. But, its weaknesses have become more and more apparent over time.
* Almost. The runtime still gets on my nerves all the time and is still frustratingly limiting in this respect.
Thanks for responding. I think static typing in Python works really well once configured so I’m surprised to hear you say that. I think it’s better than the static typing in most other languages because generics are decent and the inference is pretty reasonable. For example it seems better thought out than Java, C and (in my limited experience) Go. My rough feeling is that 75% of the Python ecosystem either has type annotations or has type stubs in typeshed. Where something particularly important is untyped, I tend to just wrap it and give it an explicit annotation (this is fairly rare). I’ve written some tips on getting mypy working well on bigger projects.
I don’t think you have the right intuition that asyncio would help you if your problem is speed. I pretty convinced that asyncio is in fact slower than normal Python in most cases (and am currently writing another blogpost about that - UWSGI is for sure the fastest and most robust way to run a python webservice). Asyncio stuff tends to fail in weird ways under load. I also think asyncio is a big problem for correctness - it actually seems quite hard to get asyncio programs right and there are a lot of footguns around.
Re: SQLAlchemy - I’m also very surprised. I think SQLAlchemy is a good ORM and I’ve used Postgres-specific features (arrays, json, user-defined functions, etc) from it a great deal. If you want to write SQL-level code there is nothing stopping you from using the “core” layer rather than the “ORM” layer. There’s also nothing stopping you from using SQL strings with parameterisation, i.e.
"select col_a from table where col_b = :something
- I do that sometimes too (there’s a fuller sketch at the end of this comment). I have to say I have never had trouble hand-optimising a SQL query in SQLA - ever - because it gives you direct control over the query (this is even true at the ORM level). One problem I have run into is where people decide to use SQLA ORM objects as their domain objects, and… that doesn’t end happily.
Celery, however, is something that I do think is quite limited. It’s really just a task queue. I am not sure that firing off background tasks as goroutines is a full replacement, though, as you typically need to handle errors, retries, recording what happened, etc. I think even if you were using Go, every serious system ends up with a messaging subsystem inside it - at least for background tasks. People do not usually send emails from their web-serving processes. Perhaps the libraries for this in Go land are better, but in Python I don’t think there is a library that gets this kind of thing wholly right. I am working on my own thing but it’s too early to recommend it to anyone (missive). I want to work on it more but childcare responsibilities are getting in the way! :)
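To make the SQL point concrete, a minimal sketch (SQLAlchemy 1.4-style, with made-up table and column names) of the parameterised-string route next to the equivalent core expression:

    from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine, select, text

    engine = create_engine("sqlite://")  # stand-in for your real DSN

    metadata = MetaData()
    some_table = Table(
        "some_table", metadata,
        Column("col_a", String),
        Column("col_b", Integer),
    )
    metadata.create_all(engine)

    with engine.connect() as conn:
        # plain SQL with bound parameters - never string concatenation
        rows = conn.execute(
            text("select col_a from some_table where col_b = :something"),
            {"something": 42},
        ).fetchall()

        # the same query through the core expression layer
        rows = conn.execute(
            select(some_table.c.col_a).where(some_table.c.col_b == 42)
        ).fetchall()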
Best of luck in your rewrite/rework. I have not been impressed with GraphQL so far, but I haven’t used the library you’re planning to use. My problems with GraphQL so far are that a) it isn’t amenable to many of the optimisations I want to do with it, b) neither schema-first nor code-first really works that well, and c) its query language is much more limited than it looks - much less expressive than I would like. You may not find that the grass is greener!
I don’t want asyncio for speed, I want it for a better organizational model of handling the various needs of the application concurrently. With Flask, it’s request in, request out, and that’s all you get. I would hope that asyncio would improve the ability to handle long-running requests while still servicing fast requests, and also somewhat mitigate the need for Celery. But still, I’ve more or less resigned from Python at this point, so it’s a moot point.
Agreed. This is not completely thought-out yet, and I don’t expect the solution to be as straightforward as fire-and-forget.
I have encountered and evaluated all of the same problems, and still decided to use GraphQL. I am satisfied with the solutions to (a) and (b) presented by the library I chose, and I feel comfortable building a good API within the constraints of (c). Cheers!
So do you plan to keep the web UI in Python using Flask, and have it talk to a Go-based GraphQL API server? Or do you plan to eventually rewrite the web UI in Go as well? If the latter, is there a particular Go web framework or set of libraries that you like, or just the standard library?
To be determined. The problems of Python and Flask become much less severe if it’s a frontend for GraphQL, and it will be less work to adapt them as such. I intend to conduct more research to see if this path is wise, and also probably do an experiment with a new Golang-based implementation. I am not sure how that would look, yet, either.
It’s also possible that both may happen, that we do a quick overhaul of the Python code to talk to GraphQL instead of SQL, and then over time do another incremental rewrite into another language.
I’m curious about why you consider that Flask does “a little bit too much”. It’s a very lightweight framework, and the only “batteries included” thing I can think of is the usage of Jinja for template rendering. But if I’m not wrong, sourcehut uses it a lot, so I don’t think this is what annoys you.
Regarding SQLAlchemy, I totally agree with you. It’s a bad database abstraction layer. When you try to make simple queries it becomes cumbersome because of SQLAlchemy’s supposedly low-level abstractions. But when you want to make a fine-grained query it’s also a real pain, and you end up writing raw SQL because it’s easier. In some cases you can embed a raw SQL fragment inside the ORM query (rough sketch at the end of this comment), but often that isn’t possible (for example, here is a crappy piece of code I’m partially responsible for). Not having a decent framework-agnostic ORM is the only thing that makes me miss Django :(
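For the cases where embedding does work, it looks roughly like this (made-up model, SQLAlchemy 1.4+):

    from sqlalchemy import Column, Integer, String, create_engine, text
    from sqlalchemy.orm import Session, declarative_base

    Base = declarative_base()

    class Article(Base):
        __tablename__ = "articles"
        id = Column(Integer, primary_key=True)
        title = Column(String)
        score = Column(Integer)

    engine = create_engine("sqlite://")
    Base.metadata.create_all(engine)

    with Session(engine) as session:
        # a raw SQL fragment dropped into an otherwise ORM-level query
        top = session.query(Article).filter(text("score > :s")).params(s=10).all()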
Regarding Flask, I recently saw Daniel Stone give a talk wherein he reflected on the success of wlroots compared to the relative failure of libweston, and chalked it up to the difference between a toolkit and a midlayer, where wlroots is the former. Flask is a midlayer. It does its thing, and provides you a little place to nestle your application into. But if you want to change any of its behavior - routing, session storage, and so on - you’re plugging into the rails it’s laid down for you. A toolkit approach would instead have the programmer always be in control, reaching for the tools they need - routing, templating, session management, and so on - as they need them.
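To sketch what I mean by a toolkit (using Werkzeug, the library underneath Flask; the routes and handlers here are made up): you hold the application object and call into routing, request parsing, and responses yourself, rather than registering into a framework’s lifecycle.

    from werkzeug.exceptions import HTTPException
    from werkzeug.routing import Map, Rule
    from werkzeug.wrappers import Request, Response

    url_map = Map([
        Rule("/", endpoint="index"),
        Rule("/articles/<int:article_id>", endpoint="article"),
    ])

    def index(request):
        return Response("hello")

    def article(request, article_id):
        return Response(f"article {article_id}")

    handlers = {"index": index, "article": article}

    @Request.application
    def application(request):
        # the application drives: match the URL, pick a handler, return a response
        try:
            endpoint, args = url_map.bind_to_environ(request.environ).match()
            return handlers[endpoint](request, **args)
        except HTTPException as e:
            return e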
I’ve personally found Falcon a bit nicer to work with than Flask, as an API/component.
That said, as a daily user of some mid-sized codebases (some 56k-odd lines of code), I very much agree with what you said about Python and SQLAlchemy.
I find that linked piece of code perplexing, because converting it from string-concat-based dynamic SQL into SQLA core looks straightforward: pull out the subqueries, turn them into Python-level variables, and then join it all up in a single big query at the end, roughly like the sketch below. That would also save you from having a switch for sqlite in the middle of it - SQLA core would handle that.
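Something like this (made-up tables, SQLAlchemy 1.4-style):

    from sqlalchemy import Column, Integer, MetaData, String, Table, func, select

    metadata = MetaData()
    articles = Table(
        "articles", metadata,
        Column("id", Integer, primary_key=True),
        Column("title", String),
    )
    comments = Table(
        "comments", metadata,
        Column("id", Integer, primary_key=True),
        Column("article_id", Integer),
    )

    # each subquery becomes a plain Python variable...
    comment_counts = (
        select(comments.c.article_id, func.count().label("n_comments"))
        .group_by(comments.c.article_id)
        .subquery()
    )

    # ...then everything is joined up in one query at the end; dialect
    # differences (sqlite vs postgres) are handled when the statement compiles
    stmt = (
        select(articles.c.title, comment_counts.c.n_comments)
        .join(comment_counts, comment_counts.c.article_id == articles.c.id)
    )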
That’s also the only thing I remember about it from when I used it years ago. Maybe it’s something everyone has to go through once to figure out that the extra layer might look tasty, but in the end it only gives you stomach aches.
[Comment from banned user removed]
Hi, I believe you meant to reply to https://lobste.rs/s/me5emr/how_why_graphql_will_influence_sourcehut#c_gozx0c
Thanks!
Yeah, I’d be very interested to hear more about that too. Not that I disagree, but I think his article was light on details. What were the things that “soured” his view of Python for larger projects, and why was he “unsatisfied with the results” of REST?
I found REST difficult to build a consistent representation of our services with, and it does a poor job of representing the relationship between resources. After all, GraphQL describes a graph, but REST describes a tree. GraphQL also benefits a lot from static typing and an explicit schema defined in advance.
Also, our new GraphQL codebase utilizes the database more efficiently; the database was the main bottleneck in the previous implementation. We could apply similar techniques there, but it would require lots of refactoring, and SQLAlchemy only ever gets in the way.
I’ve been using Flask and Gunicorn. I basically do native dev before porting it to a web app. My native apps are heavily decomposed into functions. One thing that’s weird is they break when I use them in the web setup. The functions are defined before “@app” or whatever it is, like in a native app. Then Gunicorn or Flask tells me the function is undefined or doesn’t exist.
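The shape I mean is roughly this (names made up; run with something like gunicorn app:application):

    # app.py
    from flask import Flask, request

    # helper decomposed out, defined above the app object like in my native code
    def compute_summary(text):
        return text[:80]

    application = Flask(__name__)

    @application.route("/summarise", methods=["POST"])
    def summarise():
        return compute_summary(request.get_data(as_text=True))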
I don’t know why that happens. It made me un-decompose those apps and just dump all the code in the main function. Also, I try to do everything I can outside the web app, with the web app just using a database or something. My Flask apps have stayed tiny and working, but they’re probably nearing the limit of that.
Some important points about GraphQL not mentioned in this article:
All of these fall under the problem I mentioned in the article: “The quality of server implementations has been rather poor on each of my research attempts, especially outside of JavaScript implementations.” However, they’re also all solved by the new server-side implementation I mentioned using, gqlgen. It has good tools for estimating complexity and query introspection, and tags you can decorate each path with to have fine-grained access controls. It can also cache requests by computing their hash, which can be done either client-side or server-side. Some novel optimizations are possible with GraphQL, and would have been more difficult with Python+Flask. Consider reading through our git.sr.ht GraphQL implementation and our shared GraphQL code to see some of this in action.
These are all problems that aren’t inherent to GraphQL, but were rather symptoms of the poor quality of the server-side implementations which have been available for most of GraphQL’s lifetime (and were the reason why, on multiple occasions in the past, I discarded GraphQL as an untenable solution to our problems).
It doesn’t provide anything special for pagination, but I came up with a design which I am reasonably happy with. This is one of the areas that I think GraphQL really ought to have solved better, and one I expect The Next Thing to do better.
I’ve always thought that SPARQL is way better than GraphQL at operating with graphs (simpler, way more composable). However, it’s worse at mapping to a relational database and at doing everything else that GraphQL can do (RPC).
This was a good read for me as I’m all-in on GraphQL. My stack is heavy on JavaScript/TypeScript and, as assumed in the article, I have had excellent experiences with it.
Reading the comments here concerned me though. While my projects are small right now, the possibility of being DDoS’d never occurred to me, among other things. I’ve got a lot of research to do in the AM.
Personally I would be interested in hearing more about this. I don’t recall seeing why Drew feels a REST approach is unsatisfactory.
If the backend gets completely separated from the frontend, it will open the door for custom FEs. Imagine an SPA FE for sourcehut =)
Thank you, Drew, for making your thoughts public so that we can learn and discuss :)
There is a recurring theme that I pick up: Python is a great language to start a project in. But then, later, it is decided to rewrite it in something else.
I wonder if there are exceptions, where people either stuck with Python for a large project, or went back to it after a rewrite.
Neither the absence nor the presence of these exceptions would really prove anything, but it would hint at what teams on larger projects think in general about using Python. I’d also be interested in stories like that for other less statically typed languages (I think it is a gradient).
There are a lot of “Python to Go” stories out there on the web, but here’s one that’s different:
From Python to Go and Back Again
https://news.ycombinator.com/item?id=10402307
https://docs.google.com/presentation/d/1LO_WI3N-3p2Wp9PDWyv5B6EGFZ8XTOTNJ7Hd40WOUHo/mobilepresent?pli=1&slide=id.g70b0035b2_1_10
I kinda think all codebases feel like they’re rotting after about 5 years, regardless of language. C and C++ codebases are like that too …
edit: Just read over the slides, it was a good read. Interestingly they talk about the performance of goroutines vs. async with Twisted. I’m not a Go user but I definitely like the blocking / goroutine style vs async. It sounds like they were experts in Twisted, which is relatively rare. (But yes using the boring tool you know is good)
Different tools for different applications… every language is good for something, and every language sucks for something, but it’s hard to tell that ahead of time :)
Thank you for the example, it was an interesting read. Especially the point that Python with PyPy can be more memory-efficient than Go; that’s something I wouldn’t have considered.
It seemed to be a small code base, though, since they reimplemented it in four days.
Some of the worst codebases I have ever encountered used Twisted. I shudder in horror at the memory. oof.
To this day I have a strong personal aversion (undoubtedly an unreasonable one!) to it.