1. 3

    This seems actually useful; it could replace a good part of my web searches that inevitably end with me searching for the copy-and-pastable symbol within a fileformat.info result.

    The searching seems to need some tweaks, though. E.g. looking for a regular smiley, none of “smile”, “smiley”, “happy” give the wanted result, while “face” lists too many. It turns out the right search word is “smiling”, but maybe there should be some form of aliases?

    I also had trouble with the regular red heart, but that may be of a different kind?

    $ echo "❤️ " | ~/go/bin/uni identify
         cpoint  dec    utf-8       html       name
    '❤'  U+2764  10084  e2 9d a4    ❤   HEAVY BLACK HEART (Other_Symbol)
    '◌️'  U+FE0F  65039  ef b8 8f    ️   VARIATION SELECTOR-16 (Nonspacing_Mark)
    

    How would I find this using search?

    Regarding search, some more ideas:

    • looking up by emoticon, e.g., uni identify "(:" or uni identify "<3"
    • looking up by short code, e.g. uni identify :heart: (are these standardized?)

    And a bit of a bug regarding that other stdin UX thread:

    $ echo "" | ~/go/bin/uni identify
    $ i: reading from stdin...
    
    1. 2

      The searching seems to need some tweaks, though. E.g. looking for a regular smiley, none of “smile”, “smiley”, “happy” give the wanted result, while “face” lists too many. It turns out the right search word is “smiling”, but maybe there should be some form of aliases?

      Yeah, adding more search terms is marked as “TODO” in the code. It’s a bit tricky, as it’s very easy to get way too many matches and/or pollute the output with a lot of keywords, which isn’t useful either. This is one reason I worked on a GUI emoji picker based on this code last week, but I had a lot of problems getting GTK to show ZWJ sequences well, so I kind of gave up on that for now. Basically, I’m running into the limitations of dmenu’s plain text filtering.

      I rarely use uni e <search> by the way, but instead use the “emoji-common” groups from dmenu-uni which reduces the number of emojis to a more manageable number (from about 1600 to 200).

      I also had trouble with the regular red heart, but that may be of a different kind? [..] How would I find this using search?

      Just in case this wasn’t clear (and the documentation should probably make this a bit clearer): the print, search, and identify commands work only on codepoints. They have no concept of multiple codepoints combining to form a single character (or “grapheme”, if you wish). I basically use identify mostly as a “Unicode-aware hexdump -C”.
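
To illustrate what “works only on codepoints” means for the red heart above, here is a minimal Go sketch that walks a string rune by rune, much like a Unicode-aware hexdump would. It is not uni’s actual code; the `describe` function and `codepointInfo` type are hypothetical names made up for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// codepointInfo is a hypothetical record of what a per-codepoint
// dump can know about a single rune.
type codepointInfo struct {
	Rune rune
	UTF8 string // UTF-8 bytes as space-separated hex
}

// describe splits s into codepoints. It knows nothing about graphemes,
// so a base character and its variation selector come out as two entries.
func describe(s string) []codepointInfo {
	var out []codepointInfo
	for _, r := range s { // ranging over a string yields runes, not bytes
		buf := make([]byte, utf8.RuneLen(r))
		utf8.EncodeRune(buf, r)
		out = append(out, codepointInfo{Rune: r, UTF8: fmt.Sprintf("% x", buf)})
	}
	return out
}

func main() {
	// HEAVY BLACK HEART followed by VARIATION SELECTOR-16.
	for _, cp := range describe("\u2764\uFE0F") {
		fmt.Printf("U+%04X  %d  %s\n", cp.Rune, cp.Rune, cp.UTF8)
	}
}
```

The single visible heart yields two entries, which is why a name search over individual codepoints never sees “red heart” as one thing.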

      At any rate, it shows up with e.g. uni emoji heart, or uni emoji ‘red heart’ for an exact match. It’s a bit hidden in there, because apparently we need hearts in 20 shapes and colours 🤷‍♂️ You have the same problem when you type :heart: in e.g. WhatsApp, but because the emojis are shown in colour and quite large it’s reasonably obvious. This is again kind of running into the limits of what you can do with this kind of plain text search.

      1. 2

        HEAVY BLACK HEART is the name of the red heart, as it was named as such before emoji gained color. For older Unicode characters (before color), “white” means outlined and “black” means filled in.

        1. 1

          The search problem is pretty tough to solve, as some of the Unicode descriptions use a particular English dialect, for instance:

          $ uni s poop
          no matches
          

          damn British! :)

          One possible solution would be to augment the descriptions with information from another free source, like Wikipedia.

          1. 3

            That’s actually specified in the Unicode CLDR (“Common Locale Data Repository”):

            $ grep poop en.xml
            <annotation cp="💩">dung | face | monster | pile of poo | poo | poop</annotation>
            

            It contains many useful aliases, for example for the pirate flag:

            <annotation cp="🏴‍☠️">Jolly Roger | pirate | pirate flag | plunder | treasure</annotation>
            

            I just haven’t added support for that.
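
As a rough idea of what such support could look like, here is a minimal Go sketch that turns annotation elements like the ones above into a keyword index. It is not uni’s code: `buildIndex` and the struct names are hypothetical, and the sample is wrapped in a bare `<annotations>` element for the sake of a self-contained example rather than reading the full CLDR file.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// annotation mirrors one <annotation cp="...">a | b | c</annotation> element.
type annotation struct {
	CP       string `xml:"cp,attr"`
	Keywords string `xml:",chardata"`
}

// annotations is the wrapper element holding the list (assumed for this sample).
type annotations struct {
	Items []annotation `xml:"annotation"`
}

// sample reuses the two annotations quoted in this thread.
const sample = `<annotations>
<annotation cp="💩">dung | face | monster | pile of poo | poo | poop</annotation>
<annotation cp="🏴‍☠️">Jolly Roger | pirate | pirate flag | plunder | treasure</annotation>
</annotations>`

// buildIndex maps each lowercased keyword to the characters annotated with it.
func buildIndex(data string) (map[string][]string, error) {
	var doc annotations
	if err := xml.Unmarshal([]byte(data), &doc); err != nil {
		return nil, err
	}
	index := make(map[string][]string)
	for _, a := range doc.Items {
		for _, kw := range strings.Split(a.Keywords, "|") {
			kw = strings.ToLower(strings.TrimSpace(kw))
			if kw != "" {
				index[kw] = append(index[kw], a.CP)
			}
		}
	}
	return index, nil
}

func main() {
	index, err := buildIndex(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(index["poop"], index["pirate"])
}
```

An exact-keyword map like this keeps “poop” from matching anything else, but it does nothing about the too-many-matches problem for broad words such as “face”.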

            1. 1

              oh very cool

        1. 4

          Thanks so much for this tool, I love having a command line utility to query the unicode database!

          1. 2

            bash & shellcheck & shfmt: I have come to love that combination of tools for quickly automating parts of my workflow. I often pipe my shell history into a file and then create a quick script to automate something I am working on, fc -l > script. I think of bash more as a REPL than just as a command shell.

            1. 2

              Like the author, I struggled with users who were used to emacs mode and so couldn’t switch to vi mode without losing their muscle memory. But I discovered that the vi bindings in INSERT mode are largely a subset of those in emacs mode, so I created a readline config with a vi mode that has all of the emacs key bindings as well. This way folks can use both emacs and vi keybindings. I also change the cursor to indicate which vi mode you are in.