“Human readability” is the main selling point in the description and the second of the four design goals, yet you chose to make every single JSON key an arcane three-letter abbreviation?
I am a co-author and I am happy to answer any questions!

Most field names are fairly descriptive (alg, key, sig, tmb, typ, d, x, pay), and those that are not (cad, czd) don’t have established long-form equivalents. We feel that cad is nearly as useful as canonical_digest, however cad is much more succinct. Small JSON field names keep messages small, and as a consequence Coze messages are frequently smaller than existing alternatives.
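To give a sense of the shape, here is an illustrative message using only the field names above; the values are hypothetical placeholders:

    {"pay":{"alg":"ES256","tmb":"<key thumbprint, b64ut>","typ":"<message type>"},"sig":"<signature, b64ut>"}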
Why did you decide cryptographic agility was desirable?

I have discussed this with one of them privately, and from what I gathered this is more about versioning than actual agility. They mean to phase out obsolete primitives, not allow the kind of dangerous negotiation that could enable downgrade attacks.

I haven’t reviewed Coze thoroughly enough, however, to have a strong opinion on whether they made the same mistakes JWT did on that front. It seems to be simpler at least, so there should be less room for error.
Thanks Loup!

(And Horvski is chatting with me while we reply to questions.)

I think there’s general agreement on good practice, but the definition is a little murky, and that’s the source of concern. I think Bruce Schneier’s attitude is correct: “it’s vital that our systems be able to easily swap in new algorithms when required.” We agree that developers should foresee upgrades to new algorithms and be prepared for such events.
We define cryptographic agility as not overly coupling systems to single cryptographic primitives. Instead, systems need to be designed to support future swap-outs quickly, even if those systems only ever deploy a single primitive at a time. For example, we assume that ES256 may soon need to be swapped out for a post-quantum algorithm. It’s a good idea to design systems to be prepared for that change even if they only support a single algorithm at a time.
That’s the fine line: “so that they can support multiple cryptographic primitives and algorithms at the same time” is not the same as “currently deploying multiple cryptographic primitives at the same time.” An agile system may use a single primitive. The important aspect of crypto agility is being able to change out algorithms quickly if the need arises.
From your blog (great post!), Coze works along the lines of “Versioned Protocols” where there is no negotiation or intermixing. (ES256 only ever deals with ES256, and ES256 is defined rigidly.)
To be clear: Versioned Protocols meet the goals of Schneier’s quote. In-band negotiation does too, but comes with a lot of security foot-guns.
By having an algorithm header that isn’t “v1”, “v2”, etc. you’re opening yourself to an unnecessary amount of attack surface.
For example: ECDSA doesn’t provide exclusive ownership. This means that an attacker can generate two signatures such that:
sk1 != sk2
pk1 != pk2
sig1 := sign(sk1, msg1)
sig2 := sign(sk2, msg2)
sig1 == sig2
This property matters a lot if you either:
Depend on the uniqueness of your signatures
For example, if you’re populating a hash table based on the signatures in some weird cryptocurrency scheme
Support multiple public keys in the same context
I suspect the second condition is relevant for ES256.
If you wanted to fix this, you could do it in a version 2 of the protocol (prefix the message with the public key in the hash being signed). But that means you have to describe a new algorithm (e.g. ES256v2) and an attacker may be able to downgrade between the two.
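A minimal Go sketch of that hypothetical “ES256v2” (the names and construction are illustrative, not from any spec): the public key is hashed into the signed digest, so a signature commits to exactly one key.

    import (
        "crypto/ecdsa"
        "crypto/elliptic"
        "crypto/rand"
        "crypto/sha256"
    )

    // signV2 prefixes the message with the signer's public key before hashing.
    func signV2(priv *ecdsa.PrivateKey, msg []byte) ([]byte, error) {
        h := sha256.New()
        h.Write(elliptic.Marshal(priv.Curve, priv.X, priv.Y)) // bind the public key
        h.Write(msg)
        return ecdsa.SignASN1(rand.Reader, priv, h.Sum(nil))
    }

    // verifyV2 re-derives the same key-bound digest, so a signature made
    // under one key can never verify under another.
    func verifyV2(pub *ecdsa.PublicKey, msg, sig []byte) bool {
        h := sha256.New()
        h.Write(elliptic.Marshal(pub.Curve, pub.X, pub.Y))
        h.Write(msg)
        return ecdsa.VerifyASN1(pub, h.Sum(nil), sig)
    }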
There’s a more general argument here: let’s say I’m making an awesome protocol or file format, and mark it “v1”. Put it all in production, happy users and all… and then you come in and point out a serious weakness in the format, forcing me to upgrade it to “v2”. Oops.
The problem is, transitioning from one to the other will take some time. Ideally just days, but depending on the context one might have to support both formats for weeks, months, or even years. During that period, recipients and responders need to accept this “v1” format, effectively allowing downgrade attacks.
The only safe way to handle this I know of (at least in the general case where you can’t implement specific, backward compatible mitigations) is to break users overnight, and force all providers to upgrade now. Coordinated release, CVE style.

Do you know of a better way?
As my above blog post indicates, this has a minimum number of rollout steps. I can’t envision doing it in fewer.

Fair enough, I guess I should have re-read your blog post before I asked. My problem was that you were saying there’s a difference between “v1”, “v2”, and “ALG”, “ALG_v2”. In the case of Coze I don’t believe there is: it’s just a version number that happens to be descriptive.
For instance, the potential weakness you describe would still be there if they replaced “ES256” with “v1.0.0”, and the same rollout steps would be needed. The increased attack surface doesn’t come from the naming; it comes from the simultaneous use of different algorithms. That may be encouraged by the descriptive version name, but to be honest I don’t anticipate more than a marginal effect.
My problem was that you were saying there’s a difference between “v1”, “v2”, and “ALG”, “ALG_v2”. In the case of Coze I don’t believe there is: it’s just a version number that happens to be descriptive.
No, the difference is between the following choices:
("v1", "v2", "v3", ...)
("ES256", "HS256", "ES256v2", ...)
The ES256/HS256 mode has been known to break JWT. I also believe that tolerating both ES256 and ES256v2 in the same context is risky, but a necessary temporary risk for making migrations possible.
Your protocol should be easily upgradable to use newer cryptography in response to new attacks. But these upgrades should be defined by cryptographers, not users.
Has it? I’ve just read your link, and it seems to me the real culprit is the downgrade attack to “trust me bro” (the none algorithm), not ES256 itself. I’m also not aware of anything like the none algorithm in Coze; maybe @Zamicol could confirm this.
Regardless of the specifics, it clearly seems to me that your objections do not apply to descriptive names in general. You’re arguing against a specific algorithm. And that’s great, thank you for the feedback. I’m a big fan of rooting out bad algorithms before going to production.
Your protocol should be easily upgradable to use newer cryptography in response to new attacks.
Sure. Version numbers can do that. And so can descriptive names. The only qualm I would have here is that descriptive names without a version number would make it harder to know which is the latest.
But these upgrades should be defined by cryptographers, not users.
Sure. And from what I gathered those upgrades will be defined by the Coze devs, so that should cover it.
And again, I see little difference between version numbers and descriptive names. Granted, descriptive names are harder to sort, but that’s only relevant if implementers provide several algorithms and don’t set a clear default.
Don’t get me wrong, I’m not a fan of descriptive names as version numbers. I just don’t think they’re nearly as bad as you are suggesting.
Has it? I’ve just read your link, and it seems to me the real culprit is the downgrade attack to “trust me bro” (the none algorithm), not ES256 itself.
The RS256 to HS256 algorithm confusion attack also works for ECDSA keys.

https://auth0.com/blog/critical-vulnerabilities-in-json-web-token-libraries/#RSA-or-HMAC
Ah, I see. Didn’t read far enough. The presence of both asymmetric signing and symmetric authentication may confuse implementers or users into validating an HMAC “signature”, letting attackers use the public key as if it were a secret key. Instant pwnie.
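A sketch of the vulnerable pattern in Go, with hypothetical Token and rsaVerify names; the bug is dispatching on an attacker-controlled alg field:

    import (
        "crypto/hmac"
        "crypto/sha256"
    )

    func verify(tok Token, pubKeyPEM []byte) bool {
        switch tok.Alg { // attacker-controlled
        case "RS256":
            return rsaVerify(pubKeyPEM, tok.SigningInput, tok.Signature) // intended path
        case "HS256":
            // BUG: the well-known public key bytes become the HMAC secret,
            // so anyone holding the public key can forge a valid "signature".
            mac := hmac.New(sha256.New, pubKeyPEM)
            mac.Write(tok.SigningInput)
            return hmac.Equal(mac.Sum(nil), tok.Signature)
        }
        return false
    }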
I’m not sure Coze is currently prone to that kind of confusion, however. I’m not seeing any HMAC-based authentication, and in the only example I’ve seen where "alg" is different from ES256, the digest is stored not in the "sig" field but in "id".

Hopefully the authors were aware of this problem and already planned to avoid it. (Just the fact that they’re trying to replace JWT suggests they might be.) Thank you for the warning regardless.

Coze does not have a “none” algorithm. Firstly, I don’t see the utility. Secondly, yes, we were aware of the issues it caused for JWT.

Does the canonicalisation do anything with character escapes?

The only JSON-aware operation canonicalization does is JSON compaction. Input JSON must be valid.
I have issues with this.

In my experience working on large-scale encrypted messaging systems, the size of individual messages is an overwhelming part of the difficulty of deployment - making a format “readable” should not be a goal.
Using JSON is generally questionable, as it opens you to a variety of issues caused by the format supporting arbitrary extension, duplicate labels, and a general lack of anything other than basic syntactic validation. It also drastically impacts message size: encrypted data means large amounts of random bytes, all of which are subject to base64 encoding, as nothing else can represent random bytes in JSON. The size of a single block message with this protocol is absurd.

Even the core idea of “human readable” encryption is nonsense: more or less everything other than field names, or in this protocol the encryption scheme name, is random streams, which are definitionally not human readable, and being ASCII characters that are valid in a string doesn’t change that. This protocol then even breaks that basic idea by using abbreviations for all the field names, and while some are inferable if you know the domain, the bulk are not.

Finally, cryptographic agility is an anti-feature. If you must, you use a protocol version, and the version specifies the exact primitives that will be used - messages do not specify the primitives to use or combine directly. If any kind of agreement is needed, then schema selection is just the highest version both support. Of course, you then need a bunch of additional protocol to ensure that a MiTM can’t downgrade you (recall that the negotiation occurs before you have a secure link).
I believe this is about signatures, not encryption.
What they call “agility” is actually a form of versioning. Downgrade attacks are still a thing with versions, but at least there’s a plan to phase out the old stuff.
MiTM is less of a problem with signatures. (You need to validate the public key regardless.)
The name is awkwardly close to COSE, CBOR Object Signing and Encryption.

I’m terrible at naming things. When we first started work on it I dubbed it the “radical cypher”. We wanted a better name and were joking about naming it “cyphr-JOSE”, which shortened is cjose (pronounced “josey”), which sounds like cozy. When I looked up “coze” and found it meant “a friendly talk; a chat”, we were immediately sold. But yes, both names Coze and COSE were influenced by JOSE.
Alternatively, we were just going to name it “cozy” or “coz”.
What are the advantages and disadvantages of this over PASETO, which already existed for quite some time?

I would say Coze is more generalized than PASETO. I think PASETO was written more as a response to JWT, while Coze is more like JOSE (minus the encryption).
Coze
Is JSON.
Uses digests heavily in the design.
Focuses on signing messages of any kind. This includes session tokens.
Permits several cipher suites (“algs”) and hopes to expand with new industry standards. (Currently ES224, ES256, ES384, ES512, Ed25519, Ed25519ph)
Defines a key format.
Signing, not encryption, is Coze’s focus.
PASETO
Is not JSON.
Does not use digests as references (For example, uses the public key directly.)
Focuses on security tokens.
Supports (two?) cipher suites (v3, v4)
I don’t think it defines a key format.
Supports encryption.
Both Coze and PASETO use b64ut for encoding binary values.
I’m not up-to-date with current PASETO (I dug originally into v1, but I understand that it has been deprecated), but it seems to look about the same.

Key serialization is defined in an “extension” called PASERK: https://github.com/paseto-standard/paserk
I’ve done some design & prototyping of cryptographic JSON messages going back to 2011, but never shipped anything. This spec actually looks pretty close to what I came up with, though I haven’t read it closely yet.
The biggest headache is defining a canonical form of JSON, for creating signatures. Of your spec, the only thing I disagree with is defining the canonical list of keys and their ordering. The problem with this is you can’t encode/marshal to JSON (from language-specific data structures) without knowing that list. It’s not a problem in Go since I assume you define your own MarshalJSON method for your struct, but not every language supports this. (It was a real pain in the ass in Objective-C.)
What I did instead was to:
declare a general rule for what keys are included/ignored: my spec ignored any key beginning with an underscore
order keys by lexicographic comparison (strcmp) of their UTF-8 encoding.
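A minimal Go sketch of those two rules (the helper is hypothetical, not from either spec):

    import "sort"

    // canonicalKeys drops ignored keys and orders the rest by byte-wise
    // (UTF-8 strcmp) comparison, which is Go's native string ordering.
    func canonicalKeys(obj map[string]any) []string {
        keys := make([]string, 0, len(obj))
        for k := range obj {
            if len(k) > 0 && k[0] == '_' {
                continue // keys beginning with an underscore are ignored
            }
            keys = append(keys, k)
        }
        sort.Strings(keys)
        return keys
    }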
There are a few other details of canonical encoding.
You have to specify exact rules for escapes in strings: I decreed that the only escaped characters are double-quote, backslash, and control characters.
Numbers have to be written in integer form (no decimal point or exponent) if possible. Beyond that it gets ugly — it’s quite difficult to specify or implement a unique encoding of non-integers that has round-trip fidelity after it gets parsed to an IEEE double and back to ASCII. I never had a need to put non-integers, or integers with a magnitude above 2^53, in a signed message!
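A Go sketch of that integer-only rule (hypothetical helper; it simply refuses anything without an exact integer form):

    import (
        "errors"
        "math"
        "strconv"
    )

    // canonicalNumber emits a number without decimal point or exponent when
    // it is integral and exactly representable in a float64, and rejects the rest.
    func canonicalNumber(f float64) (string, error) {
        if f == math.Trunc(f) && math.Abs(f) <= 1<<53 {
            return strconv.FormatInt(int64(f), 10), nil
        }
        return "", errors.New("no canonical integer form")
    }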
I would be really interested in seeing your work!

Canonicalization took a long time to get right before we were happy with our solution. In Go Coze, we have normal and mapslice. Mapslice was needed since Go maps are not explicitly orderable. Canonicalization in Coze JS was easier to implement.
Originally, we required that messages had to be in UTF-8 order, or in UTF-8 order after the standard Coze fields. In practice, it didn’t feel ergonomic. (See the old “alg_first” document.)
By allowing a flexible canon, we can do things like the following where “Random_junk” is put after other significant fields:
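Something along these lines, where the canon (the field list here is hypothetical) simply orders “Random_junk” last:

    ["alg","iat","tmb","typ","Random_junk"]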
Just to make this clear for future readers: mapslice is unsound (it contains data races) and definitely should not be used by any project going forward 🫡

Also for future readers, the current implementation of MapSlice is not concurrency safe. Please see the notes: https://github.com/Cyphrme/Coze/issues/10#issuecomment-1498196356

Eventually we hope to drop MapSlice entirely by replacing it with JSONv2.
You have to specify exact rules for escapes in strings: I decreed that the only escaped characters are double-quote, backslash, and control characters.

Which aligns with the JSON RFC when it says that

the characters that must be escaped [in strings are] quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).
–
Numbers have to be written in integer form (no decimal point or exponent) if possible.
Unfortunately JSON does not guarantee any precision for numbers at all, leaving that up to the implementation. Meaning if you want reliable serialization, you actually have to encode numbers as strings.
Right, in my spec only the bare minimum of escaping is allowed.
I don’t see the absence of precision in the JSON spec as a real-world problem. Any JSON implementation that can’t handle at least 32-bit ints won’t be much use. I suppose one could restrict canonical numbers to fit in 32 bits, but I don’t actually know of any JSON implementation that doesn’t support at least 53 bits.
I don’t see the absence of precision in the JSON spec as a real-world problem.
Depends on what you need.
The JSON encoded form of an integer 123 can be {"v": 123} or { "v" : 123.0 } or { "v": 123.000} or {"v":123.00000000000001}. All of those values are valid JSON numbers. They’ll be parsed by implementations into values that will (probably) evaluate as equivalent.
JSON numbers don’t — can’t! — have a canonical encoded form. If that’s not a problem for your use case, no problem.
I think you misunderstand the task at hand, which is to specify a single canonical way to encode values as JSON such that any encoder will produce the same output string given the same values. To that end, it doesn’t matter that there are multiple ways to encode a number: the algorithm specifies the exact one to use (the first one, in your example.)
You can define those rules and encode JSON in that way, sure. But when you’re decoding a JSON payload you can’t assume anything beyond what’s guaranteed by the specification. If you want to assume that canonical encoding on both the sender and receiver side, then you’re not using JSON, you’re using your own JSON derivative. That’s fine! Makes perfect sense in a closed ecosystem. But endpoints that expect that encoding can’t accept content-type: application/json anymore.
Again, I think you misunderstand. The actual payload can be sent as regular JSON encoded in any legal way — go ahead and add decimal points, extra escapes in strings, etc.
The only thing that pays attention to the canonical-form rules is the digital signature logic. To sign a JSON object you first encode it in canonical form, then sign that byte string. To verify a signature you encode the object in canonical form, then verify the signature with that string. The only place the canonical form appears is within that logic.
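A minimal Go sketch of that flow (the canonicalize helper is hypothetical and stands in for whatever canonical-form rules apply):

    import (
        "crypto/ecdsa"
        "crypto/rand"
        "crypto/sha256"
    )

    // signJSON hashes the canonical encoding and signs the digest.
    func signJSON(priv *ecdsa.PrivateKey, obj map[string]any) ([]byte, error) {
        canon, err := canonicalize(obj) // hypothetical: deterministic byte encoding
        if err != nil {
            return nil, err
        }
        digest := sha256.Sum256(canon)
        return ecdsa.SignASN1(rand.Reader, priv, digest[:])
    }

    // verifyJSON re-derives the identical canonical bytes and checks the signature.
    func verifyJSON(pub *ecdsa.PublicKey, obj map[string]any, sig []byte) bool {
        canon, err := canonicalize(obj)
        if err != nil {
            return false
        }
        digest := sha256.Sum256(canon)
        return ecdsa.VerifyASN1(pub, digest[:], sig)
    }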
Ah, OK, sure, I thought you were making stronger claims than you were. Sure, I’ve even done the same recently. But even then, numbers don’t have well defined precision, so can’t produce canonical bytes, strictly speaking. If you don’t care, then (shrug) of course.
I am thrilled that the discussion is getting into these sorts of concerns. Thank you both for your thoughts.
Some of this discussion seems to apply to a new issue opened on Coze: https://github.com/Cyphrme/Coze/issues/14