1. 28
  1.  

  2. 13
    • Old Uyghur, an historic script used in Central Asia and elsewhere to write Turkic, Chinese, Mongolian, Tibetan, and Arabic languages

    Huh, wonder if there was any committee drama about adding that one…

    1. 17

      Personally, I’m excited about the “melting face” emoji, as well as a few of the new gender and skin color variants that will piss off chuds. Also, the Emoji block of Unicode now has not one, but two amulets against the evil eye, which I expect will be extremely valuable for social media.

      1. 6

        I’m torn between “melting face” and “dotted line face”; I think they’ll replace my usage of 🙃 going forward.

        1. 5

          The emoji thing is so totally irresponsible. Humanity is never going to replace Unicode. We’re stuck with it until we either go extinct or go Luddite. Adding emoji based on whims is how you end up with things like this sticking around for four thousand years and counting. The Egyptians at least had the excuse that they didn’t know what computers were.

          1. 11

            I actually have that one saved in my favorites in UnicodePad for Android. Of course, the modern spelling would be 🍆💦.

            1. 10

              Adding emoji based on whims is how you end up with things like this sticking around for four thousand years and counting. The Egyptians at least had the excuse that they didn’t know what computers were.

              Not sure what the problem is? Ancient Egyptians living thousands of years ago didn’t share your particular cultural taboos and sensitivities, which seems like an entirely valid “excuse” to me.

              1. 2

                Right, there’s nothing that the Egyptians were doing “wrong”, because when they decided to use a penis as a letter, they had no way of knowing that for the remainder of human civilization we would have to use the penis as a letter, whether it’s culturally taboo or cool or we replace men with artificial sex bots or whatever. We, however, do know that Unicode is forever, and so the bar to adding a new character should be really fucking high. Like, “here is an alphabet that was already in use by a non-trivial number of people for some length of time.” Not, “it would be cool to make a new kind of smiley face.”

                A better system would be to do what is already done with flags. For flags, the flag 🇺🇸 is just

                U+1F1FA 🇺       REGIONAL INDICATOR SYMBOL LETTER U
                U+1F1F8 🇸       REGIONAL INDICATOR SYMBOL LETTER S
                

                We could do the same thing for other ephemera, and not have to burden Unicode with an open ended and endless list of foods that were popular with the Unicode committee in the 21st century.
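
                As a rough illustration of the flag mechanism described above, the sketch below builds the pair of regional indicator symbols for a two-letter region code. The helper name `flag_for` is made up for this example, and whether the pair actually renders as a flag depends on the font and platform.

                ```python
                # Regional indicator symbols occupy U+1F1E6..U+1F1FF, one per letter A..Z.
                REGIONAL_INDICATOR_A = 0x1F1E6


                def flag_for(region_code: str) -> str:
                    """Return the regional-indicator pair for a two-letter code (hypothetical helper)."""
                    code = region_code.upper()
                    if len(code) != 2 or not all("A" <= c <= "Z" for c in code):
                        raise ValueError("expected two ASCII letters")
                    # Shift each letter from the ASCII range into the regional indicator range.
                    return "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code)


                us = flag_for("us")
                print([f"U+{ord(c):04X}" for c in us])  # ['U+1F1FA', 'U+1F1F8']
                ```

                Nothing in the encoding says which two-letter codes are valid; that is left to whoever draws the glyphs.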

                1. 10

                  We don’t “have to use the penis as a letter” because it exists in Unicode. It’s just that it is technically representable. I’ll admit there’s nuance here - there are probably some things I’d rather see us avoid in Unicode, e.g. violence. But I’m struggling to see the harm caused in this particular case.

                  1. 5

                    Who’s to say that the United States will still be around in 4,000, 1,000, or 200 years? Or that the “US” code won’t be recycled for some other country? Hell, why should our current ISO system of labelling countries even persist? Once you start talking about these kinds of timeframes, anything is up for grabs really.

                    “Forever” is a heck of a long time. I don’t think we’re stuck with Unicode for all eternity; there are all sorts of scenarios in which we could come up with something new. I think we should just address the issues of the day; there’s no way to know what the future will be like anyway, and all we can do is focus on the foreseeable future.

                    1. 3

                      I just imagined some kind of a Unicode successor system that would have a “compatibility” block with 200k+ slots and groaned.

                      1. 1

                        That’s the whole point. US won’t mean 🇺🇸 forever. It will naturally change over time and when it does, the old codes will still be decipherable (flag for something called “US”) without needing to be supported anymore.

                        Tbh, the most likely scenario is an RoC/PRC thing where two countries claim to be the US, and then the international community will have to pick sides. Anyway, still better than having the flag as a real emoji!

                        1. 2

                          I don’t really follow how one scheme is more advantageous than the other; at the end of the day you’re still going to have to map some “magic number” to some specific meaning. I suppose you could spell out “happy” or “fireman” in special codepoints, but that just seems the same as mapping specific codepoints to those meanings, but with extra steps (although “fireman” already consists of multiple codepoints: “man” + “fire engine”, joined by a zero-width joiner, or “person” and “woman” for other gender variants).

                          The reason it’s done with flags probably has more to do with it just being easier.
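
                          To make the “fireman” composition concrete, here is a small sketch that builds the sequence from its parts and prints the official character names. The variable names are just for illustration; a platform without a matching glyph will typically show the man and the fire engine side by side.

                          ```python
                          import unicodedata

                          ZWJ = "\u200d"  # ZERO WIDTH JOINER glues the parts into one emoji
                          firefighter = "\U0001F468" + ZWJ + "\U0001F692"  # man + ZWJ + fire engine

                          # One user-visible emoji, but three codepoints underneath.
                          for ch in firefighter:
                              print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
                          ```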

                          1. 1

                            It’s not just that it’s easier; it’s that obsolescence is a built-in concept. New countries come and old countries go, and ISO adds and removes country codes. Using Slack and GitHub style :name: emojis means that you can add and drop support for specific emoji without needing to just serve up a �. It is also more forward compatible. When your friend on a new phone texts you :dotted smiley: you won’t just see �, you’ll see words that describe what is missing. Plus you aren’t using up a finite resource.
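
                            A minimal sketch of that fallback behaviour, assuming a client that keeps its own table of shortcodes. The `KNOWN` table and the `render` function are hypothetical and not how Slack or GitHub actually implement it.

                            ```python
                            import re

                            # Hypothetical table of the shortcodes this client knows how to draw.
                            KNOWN = {"smile": "\U0001F642", "eggplant": "\U0001F346"}

                            # Matches :name: where name is lowercase letters, digits, underscores or spaces.
                            SHORTCODE = re.compile(r":([a-z0-9_ ]+):")


                            def render(text: str) -> str:
                                """Swap in known emoji; leave unknown shortcodes as readable text."""
                                return SHORTCODE.sub(lambda m: KNOWN.get(m.group(1), m.group(0)), text)


                            print(render("nice :smile:"))
                            print(render("new phone, who dis :dotted smiley:"))
                            ```

                            The second call leaves `:dotted smiley:` untouched, so an old client degrades to words rather than to �.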

                            1. 3

                              Plus you aren’t using up a finite resource.

                              TIL integers are a finite resource.

                              1. 2

                                To be fair, I’ll be the first to grab popcorn when they announce that everyone and their toaster now has to adopt UTF-8 with 1-5 bytes. Will probably be as smooth and fast as our IPv4 to IPv6 migration.
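
                                For scale, a quick sketch of the numbers involved: the code space currently tops out at U+10FFFF, and every code point in that range fits in at most 4 UTF-8 bytes. The sample characters are arbitrary picks for illustration.

                                ```python
                                MAX_CODE_POINT = 0x10FFFF
                                total = MAX_CODE_POINT + 1  # 17 planes of 65,536 code points each

                                # One sample per UTF-8 length, plus the very last code point.
                                samples = ["A", "\u00fe", "\u20ac", "\U0001FAE0", chr(MAX_CODE_POINT)]
                                lengths = [len(ch.encode("utf-8")) for ch in samples]

                                print(f"{total:,} code points in total")
                                for ch, n in zip(samples, lengths):
                                    print(f"U+{ord(ch):04X} -> {n} byte(s)")
                                ```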

                              2. 1

                                When your friend on a new phone texts you :dotted smiley: you won’t just see �

                                Right, that would be useful.

                                Changing the meaning of specific codepoints or sequences of codepoints over time just seems like a recipe for confusion. “Oh, this 300 year old document renders as such-and-such, but actually, back then it meant something different from today” is not really something that I think will help anyone.

                                This already exists to some degree; e.g. “Ye olde tavern” where “Y” is supposed to represent a capital Thorn, which is pronounced as “th”, not as Y (written as þ today, but written quite similarly to Y in old-fashioned cursive writing, and early German printing presses didn’t have a Thorn on account of that letter not existing in German, so people used Y as a substitute). In this case it’s a small issue of pronunciation, but if things really shift meaning, they could become a lot more prone to misunderstanding.

                                1. 1

                                  Emoji have already shifted in ways that change their meaning. The gun emoji has become a ray gun due to political correctness and sometimes points left and sometimes right. Shoot me → zap you. The grimace 😬 was a different emotion on Android and iOS for a while. There are other documented examples of this kind of semantic shift in just a short amount of time. I think it’s a bit hopeless to try to pin them down while you keep adding stuff. The use of eggplant and peach for penis and butt is based on their specific portrayal and subject to the visual similarity being lost if they redraw them in different ways. What if President Xi demands a less sexy 🍑? Will it stick around or be a bit of passing vulgar slang from the early twenty-first century? Impossible to predict.

                    2. 4

                      Why can’t we have a little fun? What is the problem you are seeing with this?

                      1. 3

                        Your body shame is a culturally specific artifact and hardly a universal experience.

                        1. 1

                          You’re missing the point. I’m not ashamed of wieners. They are hilarious. The point is that a character can be taboo or not and we’re still stuck with it.

                          1. 2

                            If it’s not something to be ashamed of, is it really taboo enough to exclude from the Unicode standard? And furthermore, why is being stuck with it an issue? It can even be valuable from an anthropological standpoint.

                        2. 1

                          Just because a standard exists doesn’t mean we have to use all of it all the time.

                          Or should the ASCII maintainers be embarrassed that their standard contains Vertical Tab?

                          1. 2

                            My dude, Unicode inherited vertical tab from ASCII. That’s my point. Things are only going to continue to accumulate for now until the collapse of civilization. It will never shrink.

                      2. 9

                        They could have quit 10 or 15 years ago and the world would have been no worse off. Let the historical linguists hash things out in their own private-use blocks, and let people send little vector graphics to each other in a way that doesn’t require assigning them as part of an international standard and then burdening every goddamn computing device for the rest of eternity.

                        1. 15

                          There are more than just historical scripts being added. For example:

                          Tangsa, a modern script used to write the Tangsa language, which is spoken in India and Myanmar

                          There are a good number of other, still alive scripts that are yet to be added. There is no reason to burden the internet infrastructure with vector images, when all you need is a decision and a font update. If your application really cares, maybe the character data update as well.

                          1. 3

                            Imagine if they had added a Turkish delight emoji back in the 1940s. Everyone loves Turkish delight, right!? Except no, to contemporary tastes it is chalky and gross. Well, every food emoji is a bet that for the remainder of human civilization, we will want to be able to refer to a culturally specific food with a particular 32-bit number. It’s crazy. We already have as an historical accident a bunch of Japanese emoji that make no sense in an American context. Why would you deliberately add more culturally specific junk to humanity’s permanent record? In conclusion, 𓂺.

                            1. 11

                              What is in humanity’s permanent record, aside from culturally specific junk? Besides, so far there are only 144,697 characters, 1,404 emoji, and whatever else is in there. There’s still room for a few thousand more Turkish delights.

                              1. 3

                                In the interest of various parties I demand Greek delights, Parthian delights, Scythian delights and Armenian delights. Luckily Unicode can handle them all!

                                1. 1

                                  I’d like to upvote the first sentence of this post 144,697*1404 times just for starters.

                                2. 2

                                  Is that really the consensus on Turkish delight? :-)

                                  1. 1

                                    This. Emoji is a necessary part of Unicode since it must subsume earlier encodings, but emoji is a Japanese phenomenon. Any new emoji (if ever) should be left up to the Japanese, just like any new kanji.

                                  2. 1

                                    What are fonts? :)

                                    1. 1

                                      turtle graphics on acid.