These kinds of “Falsehood” articles often raise more questions than they (seem to) attempt to answer. Take the last point, for example:
People have names.
What is this supposed to tell me? Why not at least link to some resource or source explaining the issue, with suggestions for people who want to solve X-related problems? All I could conclude from this would be to forget about names beyond treating them as metadata… or was that the point?
I think raising questions is precisely the goal of such articles. To make implicit assumptions explicit and verify them in the context of a particular project.
My favourite falsehood about Unicode, that toUpper/toLower does not change the length of a string (at least when measured in graphemes, for Latin-1), is sadly coming to an end nowadays.
Previously, the uppercase equivalent of “ß” was “SS”; it compared equal to “SZ” and “SS”, and the lowercase equivalent of “SS” was “ss”, but “ss” and “ß” were not equal.
Now that ẞ exists as the official uppercase form of ß and has been in official use since 2017, the next version of Unicode is going to standardize ẞ.
The capital sharp s existed long before Unicode:
Historical typefaces offering a capitalized eszett mostly date to the time between 1905 and 1930. The first known typesets to include capital eszett were produced by the Schelter & Giesecke foundry in Leipzig, in 1905/6. Schelter & Giesecke at the time widely advocated the use of this type, but its use remained very limited.
… and it’s been in Unicode since 2008:
Capital ß (ẞ) was introduced as part of the Latin Extended Additional block in Unicode version 5.1 in 2008 (U+1E9E ẞ LATIN CAPITAL LETTER SHARP S).
The only thing that changed in 2017 was the opinion of the Council for German Orthography.
https://en.wikipedia.org/wiki/Capital_%E1%BA%9E
Yes, I realize that, but the next version of Unicode is expected to standardize the new capitalization rules, which it hasn’t done yet.
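For anyone who wants to poke at this themselves, here is a minimal Python sketch of the behaviour being discussed. It assumes a Python 3 interpreter whose bundled Unicode data follows the default full case mappings; the helper name `length_change_on_upper` is made up for illustration, and lengths are counted in code points (for these particular strings, each code point is also one grapheme).

```python
import unicodedata

SHARP_S = "\u00df"          # ß, LATIN SMALL LETTER SHARP S
CAPITAL_SHARP_S = "\u1e9e"  # ẞ, added in Unicode 5.1


def length_change_on_upper(text: str) -> int:
    """Return how many code points str.upper() adds to (or removes from) text."""
    return len(text.upper()) - len(text)


word = "stra" + SHARP_S + "e"

# The default uppercase mapping of ß is still the two-letter sequence "SS",
# so uppercasing makes the string one code point longer.
print(word.upper())
print(length_change_on_upper(word))

# ẞ exists as its own code point, and it lowercases to ß,
# but ß does not uppercase to ẞ by default.
print(unicodedata.name(CAPITAL_SHARP_S))
print(CAPITAL_SHARP_S.lower() == SHARP_S)
print(SHARP_S.upper() == CAPITAL_SHARP_S)

# Case folding (intended for caseless comparison) maps ß to "ss".
print(SHARP_S.casefold() == "ss")
```

So the round trip is lossy in both directions: upper-then-lower turns “straße” into “strasse”, and the capital letter only appears if the input already contains it.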
If you like this kind of thing: http://yourcalendricalfallacyis.com/
https://github.com/kdeldycke/awesome-falsehood
This wavers a bit between dealing with properties of names (“People’s first names and last names are, by necessity, different.”) and properties of technology (“People’s names are written in any single character set.”), which is a bit odd and makes it something of a rant about how limited text handling still is. Yes, there are characters which do not exist in Unicode, or in any character encoding standard, and I’m sure some people write their names with them. However, that issue would come up any time those characters are used, in a name or not, and it will eventually cease to be a problem, given that Unicode continues to expand.