I disagree on the 404 rule. You should absolutely use 404 both for missing resources and for URLs your app doesn’t recognise. Additional context can be provided in the response body which is presumably a structured error from the Rule 10. HTTP is the transport protocol. It has its own semantics, and implementations should not willy-nilly change them. The API semantics should be limited to the payload, that is, the response body and maybe non-standard headers. That’s what your client application should interpret. If you’re having a hard time separating the two, imagine using a completely different transport. Say, substitute HTTP with snail mail. How much of HTTP do you need to reinvent to make your API work?
I think this excerpt from Fielding’s original dissertation, where he described REST, is pretty great:
HTTP is not designed to be a transport protocol. It is a transfer protocol in which the messages reflect the semantics of the Web architecture by performing actions on resources through the transfer and manipulation of representations of those resources. It is possible to achieve a wide range of functionality using this very simple interface, but following the interface is required in order for HTTP semantics to remain visible to intermediaries.
Not only is diverging from the expected semantics of HTTP unusual, but it also presumes that all intermediaries are going to agree with your new twist. Another commenter very smartly commented that “status codes are for clients you don’t control,” which should include all the layers which may or may not be present on the public internet: proxies, caches, etc.
Additional context can be provided in the response body which is presumably a structured error from the Rule 10.
By that logic, why do we have different error-codes at all? Just use http codes 2, 3 and 4 and put custom error texts.
No, I agree with the rule. Don’t use 404 for two different meanings. In particular, since this is not a rare edge case but a common problem in almost any API, having two codes for “request-path not understood” and “request-path understood, result is not there” is absolutely a good idea. We just have to find the best way to do that with the limited set of HTTP codes that are available.
We have HTTP codes for HTTP clients. Specifically for HTTP clients that don’t speak your API like HTTP proxies. For them 404 is “I don’t have it” and they probably know how to interpret it. They don’t care whether the URL is handled or not by an app on the server. For what it’s worth it can be a bunch of files on a disk. And then it would definitely be the same. The endpoint not handled is identical to a missing file/directory. The missing resource is identical to a missing file/directory.
Now, your API client might want to distinguish between the two and your server might provide the information. Good, you should do that within your API. Which means in the response payload, not by inventing your own semantics for existing protocols.
For example, we all settled on 8 bit bytes but it’s just a convention. In the early days we had all sorts of byte lengths and even now we occasionally build very specialised CPUs with bytes other than 8 bit long. This doesn’t mean you can decide to use, say 7 bits everywhere. You can try but you’re going to have a hard time.
Same goes for HTTP. You might be looking for edge cases of using 410 instead of 404 a bit longer than trying to go with 7 bit bytes but this decision is definitely not consequence-free.
I’d say you do. HTTP Semantics (RFC 9110) specifies a registry for status codes and says a registration MUST contain specific information. This is in contrast with headers, which also have a registry but whose registration is not a MUST and is more of an informative nature.
One of the things that the registry specifies is how the response should be handled. For example, 404 is cacheable unless headers say otherwise. 400 is not. And I’m mentioning 400 because clients that don’t understand the status should treat it as x00 from the same class. So 460 is going to look like 400 to compliant clients. And Bad Request is probably further from 404 than what you intended.
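To make the fallback concrete, here is a minimal sketch of how a generic client might collapse a status code it doesn't recognise to the x00 code of its class. The function name `effective_status` is made up for illustration, and Python's `http.HTTPStatus` enum stands in for "the codes this client understands", which is an assumption rather than a copy of the IANA registry.

```python
from http import HTTPStatus


def effective_status(code: int) -> int:
    """Return the status a generic client would act on.

    Recognised codes are returned unchanged; unrecognised ones fall back
    to the x00 code of the same class (e.g. 460 -> 400).
    """
    try:
        HTTPStatus(code)  # raises ValueError for codes the enum doesn't know
        return code
    except ValueError:
        return (code // 100) * 100


print(effective_status(404))  # recognised, stays 404
print(effective_status(460))  # unrecognised, handled like 400
```

Under that assumption a custom 460 and an unregistered 498 both end up being handled as a plain 400 Bad Request.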
Another thing 9110 also says (15.5. Client Error 4xx):
Except when responding to a HEAD request, the server SHOULD send a representation containing an explanation of the error situation, and whether it is a temporary or permanent condition.
So basically what I said previously: 404 Not Found + compatible payload explaining whether the API endpoint is not there or the resource is missing.
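As a rough sketch of what such a payload could look like, the snippet below builds a 404 whose body says which of the two cases happened. It borrows the `application/problem+json` shape from RFC 7807; the `type` URIs under `api.example.com` and the helper name `not_found` are hypothetical.

```python
import json


def not_found(kind: str, path: str) -> tuple[int, dict, str]:
    """Build a 404 response whose body distinguishes the two cases.

    kind is "endpoint" (no route matches the path) or "resource"
    (the route exists but the object does not).
    """
    if kind not in ("endpoint", "resource"):
        raise ValueError(f"unknown kind: {kind}")
    body = {
        # Hypothetical problem-type URI; any stable identifier would do.
        "type": f"https://api.example.com/problems/{kind}-not-found",
        "title": f"{kind.capitalize()} not found",
        "status": 404,
        "detail": f"No {kind} exists at {path}",
    }
    headers = {"Content-Type": "application/problem+json"}
    return 404, headers, json.dumps(body)


status, headers, body = not_found("resource", "/application/listings/123")
print(status, headers["Content-Type"])
print(body)
```

Proxies and caches only ever see a normal 404, while an API client that cares can branch on the `type` field.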
Thanks, I guess you are right. It’s a bit frustrating. Recently I was looking for what code to use in case of an expired token. I ended up using 498 which is not registered, so I’m violating the spec. But what good is a spec that doesn’t even cover such basic cases… makes me frustrated.
I assume it’s some sort of access token (e.g. API access token). If so, it looks a lot like 401 Unauthorized.
The 401 (Unauthorized) status code indicates that the request has not been applied because it lacks valid authentication credentials for the target resource.
9110 also very clearly states what should happen next:
The server generating a 401 response MUST send a WWW-Authenticate header field […] containing at least one challenge applicable to the target resource.
For example, OAuth defines its own auth-scheme for this header. It also defines a few parameters that allow exposing some information about the nature of the error so your client might not need to parse the payload.
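For the expired-token case, a sketch of the header might look like the following. It assumes the OAuth 2.0 Bearer scheme from RFC 6750, where `invalid_token` is the error code used for expired tokens; the helper name `bearer_challenge` is made up, and the values are assumed not to contain double quotes, since no escaping is done.

```python
from typing import Optional


def bearer_challenge(realm: str,
                     error: Optional[str] = None,
                     description: Optional[str] = None) -> str:
    """Format a WWW-Authenticate value for the Bearer auth scheme."""
    params = [f'realm="{realm}"']
    if error:
        params.append(f'error="{error}"')
    if description:
        params.append(f'error_description="{description}"')
    return "Bearer " + ", ".join(params)


response_headers = {
    "WWW-Authenticate": bearer_challenge(
        "example", "invalid_token", "The access token expired"
    )
}
print(401, response_headers)
```

A client can then decide to refresh its token from the `error` parameter alone, without parsing the response body.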
Pretty good advice overall, but I do want to spotlight something
Rule #2 DON’T add unnecessary path segments
Good advice, and yet the GOOD example…
# GOOD
GET /v3/application/listings/{listing_id}
commits the cardinal sin of REST API versioning… the entirely useless version segment! What’s weird is that the author demonstrates an understanding of why this is bad elsewhere:
URLs are resource identifiers, not representations. Adding representation information to the URL means there’s no canonical URL for a ‘thing’. Clients may have trouble uniquely identifying ‘things’ by URL.
Putting version information into the resource name also breaks this principle.
Tell me what is the distinction between these resources:
Nothing whatsoever. You are communicating something about the representation within the identifier, which is a violation of concerns.
Why would you change a resource identifier? Let’s count the ways
Adding a new resource: just add it
Removing an old resource: HTTP already gives you the tools to handle this
Moving a resource: ditto
Changing the format of your unique identifier: okay that might be legitimate, but there’s a better way.
Stripe’s approach has been very influential to me. Using ISO 8601 date stamps as version numbers, pinning API consumers based on first request, a compatibility layer that translates older API schemas to new ones: all good things. The only issue I have is that they chose to use a custom Stripe-Version header, and I disagree that this is right.
This is fundamentally about having multiple, incompatible representations of the resource. How do we choose between representations? Content Negotiation.
This abstract definition of a resource enables key features of the Web architecture. First, it provides generality by encompassing many sources of information without artificially distinguishing them by type or implementation. Second, it allows late binding of the reference to a representation, enabling content negotiation to take place based on characteristics of the request. Finally, it allows an author to reference the concept rather than some singular representation of that concept, thus removing the need to change all existing links whenever the representation changes (assuming the author used the right identifier).
Encoding the API version into the resource name means that the client becomes solely responsible for requesting the representation, there is no opportunity for negotiation between client and server.
# BAD
GET /v3/application/listings/{listing_id}
Accept: application/json
# GOOD
GET /application/listings/{listing_id}
Accept: application/vnd.company.my-api+json; version=2023-11-01
In short: use http headers instead of the url to encode the version. Is that really that much better though? I thought that http headers shouldn’t change semantics, and changing the versions is a semantics change to me. But then I guess you can argue that the version is not part of the semantics when talking about http entities.
Tbh, it seems fairly academic to me. I’d rather stick to putting the version into the url and forcing the clients to assume that the same $listing_id under v1 and v2 might be the same entity, but might actually be different entities, forcing the clients to read my release notes.
In short: use http headers instead of the url to encode the version.
No. Use the Accept header specifically to encode the version, because what you are doing is content negotiation, and HTTP designates that header for that purpose.
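On the server side, reading that version is a few lines of parsing. This is a deliberately naive sketch: it splits on commas and semicolons and ignores q-values and commas inside quoted strings, so a real service would want a proper Accept parser. The function name is hypothetical; the media type is the one from the GOOD example above.

```python
from typing import Optional


def parse_accept_version(accept: str, media_type: str) -> Optional[str]:
    """Return the `version` parameter sent with media_type, if any."""
    for item in accept.split(","):
        parts = [part.strip() for part in item.split(";")]
        if parts[0].lower() != media_type:
            continue
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "version":
                return value.strip().strip('"')
    return None


MEDIA_TYPE = "application/vnd.company.my-api+json"
header = "application/vnd.company.my-api+json; version=2023-11-01"
print(parse_accept_version(header, MEDIA_TYPE))
```

When the function returns `None`, the server is free to pick a default, for example the version the consumer was pinned to.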
I thought that http headers shouldn’t change semantics, and changing the versions is a semantics change to me.
If you use path segments to version your API, you have no way to distinguish a syntactical change from a semantic one, because you’re putting the information in the wrong place.
A semantic change should be reflected in the resource identity. A syntactic change should be reflected in the representation. These are separate concerns and should not be conflated.
Tbh, it seems fairly academic to me. I’d rather stick to putting the version into the url and forcing the clients to assume that the same $listing_id under v1 and v2 might be the same entity, but might actually be different entities, forcing the clients to read my release notes.
This perfectly encapsulates the problems I am critiquing:
you have placed yourself entirely at the whim of the client implementations
your API design does not communicate intent and relies instead on out-of-band information to understand
there’s no orderly process of deprecation and data migration
This creates a situation that makes breaking changes very challenging to release, which in turn makes your API design accumulate undesirable warts, which makes integrating with your API unpleasant.
This is why people come to hate HTTP APIs like this.
(I object to your characterization of this as a scholastic exercise! I went to community college. My strong opinions have been formed by years of practical experience integrating with bad APIs.)
I guess the reason I don’t like to use headers is because they are rarely used for those things. The Accept-header also seems to be used only for transparent compression. Another thing I don’t like is that headers are often not captured by logging tools for various reasons - so in case of a bug, it’ll be hard to reproduce it without knowing the headers if they impact the response.
In other words, both solutions now seem unsatisfying to me.
How do you like to handle paths that change between versions?
EDIT - I guess I’m asking where you handle the routing for v2023-10-31 vs v2023-11-1. I could imagine you could handle it at the routing layer by checking the media type parameter and sending it to different servers entirely, or you could handle it in the application layer.
That depends on the implementation, and the reasons for your version change. If it’s all just happening in one app, you can handle it there. If you’re transitioning to a new codebase, you could introduce a reverse-proxy. Basically, your choices aren’t different here from the path segment, you just have more flexibility in introducing version updates to represent incompatible changes piecemeal without making a drastic change on the client side.
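Handling it in one app could be as small as the sketch below: each dated version has a renderer, and a request gets the newest version that is not later than the one it asked for. Everything here is an assumption for illustration: the two renderers, the field names, and zero-padded ISO 8601 date strings (which is what makes plain string comparison work).

```python
from bisect import bisect_right


def render_2023_10_31(user: dict) -> dict:
    # Hypothetical older representation: a single name field.
    return {"name": f"{user['first']} {user['last']}"}


def render_2023_11_01(user: dict) -> dict:
    # Hypothetical newer representation: the name split in two.
    return {"first_name": user["first"], "last_name": user["last"]}


RENDERERS = {
    "2023-10-31": render_2023_10_31,
    "2023-11-01": render_2023_11_01,
}


def pick_version(requested: str) -> str:
    """Newest known version that is <= the requested one."""
    versions = sorted(RENDERERS)
    index = bisect_right(versions, requested)
    if index == 0:
        raise LookupError(f"no representation as old as {requested}")
    return versions[index - 1]


def represent(user: dict, requested: str) -> dict:
    return RENDERERS[pick_version(requested)](user)


user = {"first": "Ada", "last": "Lovelace"}
print(represent(user, "2023-10-31"))
print(represent(user, "2023-11-15"))
```

The URI never enters into it: the same stored user is rendered differently depending on the negotiated version, and moving the renderers behind a reverse proxy later would not change what clients send.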
Negotiating between different representations of the same resource is content negotiation, and the HTTP way of doing that is the Accept header, not something else. Don’t reinvent the wheel.
I think it gets nuanced pretty quickly… Is it the same resource if what is returned has different structure? How different does it need to be such that you consider it a different resource?
Also, changing headers is harder (though not that hard) for clients. Often making your API easier to use for others trumps doing it correctly. Changing v1 to v2 in the URL (which I have to specify always) is easier than having to change an Accept header which you’re now making me set, and I otherwise might not have to.
You are confusing resource and representation. The URI is the resource. The HTTP body that gets returned is the representation. They are separate concerns. Choosing between representations is content negotiation.
Changing the URI to indicate a change in representation is the wrong place according to HTTP. It’s like putting file extensions in your URL instead of using media types. You can do it, but you’re misusing the tools that the protocol has given you.
I don’t think I am confused, just poorly explaining my point of view. If a resource that represents a user switches from containing a user ID that is an int to one that’s a UUID, are those the same resource? I am arguing they are not.
I agree that if that resource is sent back as JSON or serialized protobuf, then it is the same resource in different representations. However, I do not think it is an abuse of the HTTP protocol to assign different URIs for a resource that has evolved/changed as the service has. Using a version number in the URI to indicate this change of the resource seems logical, intuitive, and valid with respect to the protocol.
The ID field is special in this regard, because it most likely indicates a change of URI, and that is why this would be considered a new resource, not the change of representation.
Let’s alter your example to exclude this ambiguity. Let’s say that you’re making a breaking change in the structure of the user data from
Is this the same resource? Clearly, yes. You are just changing the representation. This is when I would increment the API Version. Changes that are strictly additive don’t break a contract, but changing an existing field that a client may depend on does. The thing you are communicating in the Version is that the contract has changed.
The URI of this record should not change. The relationships of this User to other things have not changed, the identity of the User has not changed, merely your representation of its data. Needing to alter an unrelated property such as the URI is a violation of concerns. URIs should only change when the identity of the resource changes.
This is the principle of encapsulation. Things that are independent should not have to change in tandem. That means there is a conflation of responsibilities happening.
EDIT: Oh I’d do dispatch where it made sense. Probably in the application unless different versions are because the entire application has to change for some reason (new provider or something). That way all the possible answers to your Api call are in the same place.
This has a lot of good advice. I’ve gone back and forth on the 404 thing. The conclusion I’m at now is that status codes are for the clients you don’t control. If you control the client, you can tell them to use 410 or whatever other code for “right URL, wrong resource.”
yes! what a great razor for simplifying the decision making process.
Not only is 404 the correct error according to the semantics of http as a transport, but the semantics of your particular application are a concern for the application not the transport. It takes an equal amount of time to convey to a client developer that 404 bodies should be checked for application/api specific concerns as it does to tell the client developer that you are going to use whacky http response codes in unexpected ways.
So your “do you control the client” question really cuts to the meat of the thing, if you control both sides of the comms then you can subvert the standards however you want because you aren’t beholden to any expectations but your own.
I was going to mention this also, but figured it would get lost in the wall of text. I’m a fan of RFC 7807, been doing this with my service for a while.
Besides the stated reasoning for string IDs, using integer IDs in JSON is a quick path towards very subtle bugs.
Many JSON parsers, JavaScript’s included, read every number as a double, so in practice you don’t get 64 bit integers. Large 64 bit integer quantities will get mangled when they go over JSON unless you wrap them in strings.
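A quick way to see the mangling without leaving Python: the standard `json` module keeps integers exact by default, so this sketch passes `parse_int=float` to imitate a parser that reads every number as a double. The ID value is just 2**53 + 1, the first integer a double cannot hold.

```python
import json

raw = '{"id": 9007199254740993}'  # 2**53 + 1 as a bare JSON number

# Imitate a double-only parser by forcing integers through float.
lossy = json.loads(raw, parse_int=float)
mangled_id = int(lossy["id"])
print(mangled_id)  # off by one: 9007199254740992

# Wrapped in a string, the same ID survives the same parser untouched.
safe = json.loads('{"id": "9007199254740993"}', parse_int=float)
safe_id = int(safe["id"])
print(safe_id)  # 9007199254740993
```

The bare number comes back one lower than it was sent, with no error anywhere, which is what makes this kind of bug so hard to spot.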
While I don’t have an issue with transferring ids as strings, prefixing ids irks me a bit. It does make it easier to identify the ids on sight. However, it’s going to make ids bigger and slower to compare than an int or uuid. This will hurt query performance unless you hash the ids and compare on the hash.
I recognize that this is probably data engineer bias.
How do people feel about 404s when someone doesn’t have permission to view something? I tend to think it’s more correct, but also find it somewhat infuriating. It makes it harder for an attacker to understand a system, but also harder to support legitimate users who may need help.
There is plenty of room for disagreement, but I think the notion here is that REST APIs fundamentally have two types of resources: collections and objects. A collection can hold zero or more objects, so it is (in English at least) natural and correct for the name of a collection to be the plural form of the type of object which it contains.
Just because you have requested a specific object from a collection does not change the fact that the collection itself holds zero or more objects.
So then, let’s assume I use 460 as error code.
But if you’re using something else, you can use other schemes or invent your own like AWS did.
The best way to version an API is via media type parameter.
Fair enough, you make a good point.
You can use headers for that. X-API-VERSION
Let’s alter your example to exclude this ambiguity. Let’s say that you’re making a breaking change in the structure of the user data from
and now you want to represent it as
Add the new path?
http 308 response?
For representing errors I would propose using RFC 7807 “Problem Details for HTTP APIs”.
The standard itself is a quick read, and easy to implement.
I liked the advice except for rule 1. If you have an endpoint for all products, then sure, specific products should be under /products/:id
But if you can’t list all products, then sure, keep the name /product/:id
It’s more about consistency than the plural name, and consistency is a different rule in the list.