Threads for etc

    1. 8

      Hi! Here the author represents source code in Prolog and implements an IDE on top of that: https://lingo-workbench.dev/ = https://petevilter.me/post/datalog-typechecking/
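
      For readers who have not seen the idea, here is a minimal sketch of flattening a syntax tree into facts that can be queried like a database. It uses Python's own `ast` module rather than Prolog or Datalog, and `to_facts` plus the `node`/`child` relation names are made up for illustration; it is not how the linked project works.

      ```python
      import ast

      def to_facts(source):
          """Flatten a Python AST into ("node", id, kind) and ("child", parent, child) tuples."""
          tree = ast.parse(source)
          ids = {}
          facts = []
          # First pass: give every node a small integer id and record its kind.
          for node in ast.walk(tree):
              if id(node) not in ids:
                  ids[id(node)] = len(ids)
                  facts.append(("node", ids[id(node)], type(node).__name__))
          # Second pass: record parent/child edges between those ids.
          for node in ast.walk(tree):
              for child in ast.iter_child_nodes(node):
                  facts.append(("child", ids[id(node)], ids[id(child)]))
          return facts

      facts = to_facts("x = 1")
      kinds = {i: kind for rel, i, kind in facts if rel == "node"}
      assigns = {i for i, kind in kinds.items() if kind == "Assign"}
      # "Query": which node kinds are direct children of an assignment?
      children_of_assign = sorted(
          kinds[c] for rel, p, c in facts if rel == "child" and p in assigns
      )
      print(children_of_assign)
      ```

      A real Datalog engine would evaluate recursive rules over such relations; the list comprehensions here only stand in for a single join.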

      1. 3

        Very nice! This is definitely where I’m trying to go.

        I built a structural editor prototype a while ago, and it would be nice to use Prolog (or Datalog) to power the tooling.

      2. 10

        I understand the rationale for including the original emoji (unicode wants to be a superset of existing character sets) but they should have been put in a code space reserved for backwards compatibility with bad ideas, not made such a big part of unicode.

        At this point, there’s a strong argument for a new character set that is a subset of unicode that removes all of the things that are not text. We already have mechanisms for embedding images in text. Even in the ‘90s, instant messaging systems were able to avoid sending common images by having a pre-defined set of pictures that they referenced with short identifiers. This was a solved problem before Unicode got involved and it’s made text processing an increasingly complicated mess, shoving image rendering into text pipelines for no good reason.

        The web could have defined a URL encoding scheme for emoji from an agreed set, or even a shorthand tag with graceful fallback (e.g. <emoji img="gb-flag;flag">Union Flag</emoji>, which would render a British flag if you have an image for gb-flag, a generic flag if you don’t, and use ‘Union Flag’ as the alt text, or as the fallback if you don’t support emoji). With the explicit description and fallback, you avoid things like ‘I’m going to shoot you with a 🔫’ being rendered as ‘I’m going to shoot you with a {image of a gun}’ or ‘I’m going to shoot you with a {image of a water pistol}’ depending on the platform: if you didn’t have the water-pistol image, you’d fall back to the text, not show the pistol image.
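
        To make the fallback rule concrete, here is a minimal sketch in Python. The function name `resolve_emoji`, the semicolon-separated candidate list, and the set of installed image names are all assumptions for illustration; no such tag or API exists.

        ```python
        def resolve_emoji(img_attr, alt_text, installed_images):
            """Return ("image", name) for the first drawable candidate, else ("text", alt_text)."""
            # Candidates are listed most specific first, separated by semicolons.
            candidates = [name.strip() for name in img_attr.split(";") if name.strip()]
            for name in candidates:
                if name in installed_images:
                    return ("image", name)
            # No candidate image is available: fall back to the author's own words.
            return ("text", alt_text)


        print(resolve_emoji("gb-flag;flag", "Union Flag", {"gb-flag", "flag"}))
        print(resolve_emoji("gb-flag;flag", "Union Flag", {"flag"}))
        print(resolve_emoji("gb-flag;flag", "Union Flag", set()))
        ```

        The point of the design is that the sender, not the receiving platform, decides what the reader sees when the preferred image is missing.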

        1. 24

          Like it or not, emoji are a big part of culture now. They genuinely help convey emotion in a fairly intuitive manner through text, way better than obscure tone indicators. I mean, what’s more understandable?

          “Are you going to leave me stranded? 😛”

          “Are you going to leave me stranded? [/j]”

          It definitely changes the meaning of the text. They’re here to stay, and being in Unicode means they got standardized, and it wouldn’t have happened otherwise.

          Of course there’s an issue with different icon sets having different designs (like how Samsung’s 😬 was completely different from everyone else’s), but those tend to get resolved eventually.

          1. 4

            Like it or not, emoji are a big part of culture now. They genuinely help convey emotion in a fairly intuitive manner through text, way better than obscure tone indicators.

            Except they don’t. Different in-groups assign different meanings to different ones. Try asking someone for an aubergine using emoji some time and see what happens.

            “Are you going to leave me stranded? 😛”

            This is culturally specific. It’s an extra set of things that people learning English need to learn. This meaning for sticking out your tongue is not even universal across European cultures. And that’s one of the top ten most common reaction emoji; once you get deeper into the hundreds of others, the meaning is even further removed. How would you interpret the difference between 🐶 and 🐕 in a sentence?

            Of course there’s an issue with different icon sets having different designs (like how Samsung’s 😬 was completely different from everyone else’s), but those tend to get resolved eventually.

            That’s an intrinsic property of using Unicode code points. They are abstract identifiers that tell you how to find a glyph. The glyphs can be different. A Chalkboard A and a Times A are totally different pictures because that’s an intrinsic property of text. If Android has a gun and iOS has a water pistol for their pistol emoji, that’s totally fine for characters but a problem for images.

            1. 16

              😱 Sure, emojis are ambiguous. And different groups can use them differently. But that doesn’t mean they don’t convey meaning? The fact that they are so widely used should point towards them being useful, no? 😉

              1. 7

                I never said that embedding images in text is not useful. I said that they are not text, do not have the properties of text, and treating them as text causes more problems than it solves.

                1. 4

                  Emoji are not alphabets, syllabaries, abugidas, or abjads. But they are ideograms, which qualifies them as a written script.

                  1. 1

                    I disagree. At best, they are precursors of an ideographic script. For a writing system, there has to be some kind of broad consensus on semantics and there isn’t for most emoji beyond ‘that is a picture of X’.

                    1. 3

                      For a writing system, there has to be some kind of broad consensus on semantics

                      Please describe to me the semantics of the letter “р”.

                      1. 1

                        Please describe to me the semantics of the letter “р”.

                        For alphabetic writing systems, the semantics of individual letters is defined by their use in words. The letter ‘p’ is a component in many of the words in this post and yours.

                        1. 6

                          Thank you! (That was actually U+0440 CYRILLIC SMALL LETTER ER, which only featured once in both posts, but no matter.)
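
                          The lookalike is easy to confirm with Python's `unicodedata` module; a minimal check:

                          ```python
                          import unicodedata

                          latin_p = "p"            # U+0070
                          cyrillic_er = "\u0440"   # U+0440, drawn almost identically in most fonts

                          names = [unicodedata.name(ch) for ch in (latin_p, cyrillic_er)]
                          for ch, name in zip((latin_p, cyrillic_er), names):
                              print(f"U+{ord(ch):04X} {name}")

                          # Same-looking glyphs, different code points.
                          same_character = latin_p == cyrillic_er
                          print(same_character)
                          ```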

                          the semantics of individual letters is defined by their use in words

                          The thing is, I disagree. “e” as a letter itself doesn’t have ‘semantics’, only the words including it do[1]. What’s the semantics of the letter “e” in “lobster”? An answer to this question isn’t even wrong. It gets worse when different writing systems interpret the same characters differently: if I write “CCP”, am I referring to the games company CCP Games? Or was I abbreviating сoветская социалистическая республика? What is the semantics of a letter you cannot even identify the system of?

                          Emoji are given meaning of different complexity by their use in a way that begins to qualify them as logographic. Most other writing systems didn’t start out this way, but that doesn’t make them necessarily more valid.

                          [1]: The claim doesn’t even hold in traditional logographic writing systems which by all rights should favor your argument. What is the semantics of the character 湯? Of the second stroke of that character? Again, answers aren’t even wrong unless you defer to the writing system to begin with, in which case there’s no argument about (in)validity.

              2. 13

                Except they don’t. Different in-groups assign different meanings to different ones.

                This is true of words as well.

                1. 3

                  Yes, but their original point is that we should be able to compose emojis like we compose words, as in the old days of phpBB and instant messaging. :mrgreen:

                  1. 13

                    Just a nit: people do compose emojis - I see sequences of emojis all the time. People send messages entirely of emojis that other people (not necessarily me) understand.

                    1. 9

                      The fact that an in-group can construct a shared language using emoji that’s basically opaque to outsiders is probably a big part of their appeal.

                      1. 8

                        Yeah, and also there’s nothing wrong with that, it’s something any group can and should be able to do. I have no entitlement to be able to understand what other people say to each other (you didn’t claim that, so this isn’t an attack on you. I am just opposed to the “I don’t like / use / understand emojis how other people use them therefore they are bad” sentiment that surfaces periodically).

                    2. 4

                      That’s fair, I’m just nitpicking a specific point (that happens to be a pet peeve of mine).

                    3. 2

                      This is true of words as well.

                      But not of characters and rarely true even of ideographs in languages that use them (there are exceptions but a language is not useful unless there is broad agreement on meaning). It’s not even true of most words, for the same reason: you can’t use a language for communication unless people ascribe the same meaning to words. Things like slang and jargon rarely change more than a small fraction of the common vocabulary (Clockwork Orange aside).

                      1. 6

                        Without getting into the philosophy of what language is, I think this skit best illustrates what I mean (as an added bonus, emoji would have resolved the ambiguities in the text).

                        Note I’m not arguing for emoji to be in Unicode, I’m just nitpicking the idea that the problem with them is ambiguity.

                        1. 2

                          Socrates would like to have a chat with you. I won’t go through the philosophical tooth-pulling that he would have enjoyed, but suffice it to say that most people are talking past each other and that most social constructions are not well-founded.

                          I suspect that this is a matter of perspective; try formalizing something the size of English Prime (or, in my case, Lojban) and see how quickly your intuitions fail.

                  2. 15

                    I understand the rationale for including the original emoji (unicode wants to be a superset of existing character sets) but they should have been put in a code space reserved for backwards compatibility with bad ideas, not made such a big part of unicode.

                    Except emoji have been absolutely stellar for Unicode: not only are they a huge driver of Unicode adoption (and, through it, of UTF-8) because they’re actively desirable to a large proportion of the population, they’ve also been a huge driver of improvements to all sorts of useful Unicode features which renderers otherwise tend to ignore despite their usefulness to the rendering of actual text, again because they’re highly desirable and platforms which did not support them got complaints. I fully credit emoji with MySQL finally getting their heads out of their ass and releasing a non-broken UTF-8 (in 2010 or so). That’s why the Unicode Consortium has been actively leveraging emoji to force support for more complex compositions.

                    And the reality is there ain’t that much difference between “image rendering” and “text pipeline”. Rendering “an image” is much easier than properly rendering complex scripts like Arabic, Devanagari, or Burmese (or Z̸̠̽a̷͍̟̱͔͛͘̚ĺ̸͎̌̄̌g̷͓͈̗̓͌̏̉o̴̢̺̹̕), even ignoring that you can use text presentation if you don’t feel like adding colors to your pipeline.

                    Even in the ‘90s, instant messaging systems were able to avoid sending common images by having a pre-defined set of pictures that they referenced with short identifiers.

                    After all what’s better than one standard if not fifteen?

                    This was a solved problem before Unicode got involved and it’s made text processing an increasingly complicated mess, shoving image rendering into text pipelines for no good reason.

                    This problem was solved by adding icons in text. Dingbats are as old as printing, and the Zapf Dingbats which unicode inherited date back to the late 70s.

                    The web

                    Because nobody could ever want icons outside the web, obviously. As demonstrated by Lucida Icons having never existed.

                    1. 10

                      subset of unicode that removes all of the things that are not text

                      It sounds like you disagree solidly with some of Unicode’s practices so maybe this is not so appealing, but FWIW the Unicode character properties would be very handy for defining the subset you’d like to include or exclude. Most languages seem to have a stdlib interface to them, so you could pretty easily promote an ideal of how user input like comment boxes should be sanitized and offer your ideal code for devs to pick up and reuse.
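
                      As a rough sketch of what that could look like in Python: the stdlib `unicodedata` module exposes the general category but not the `Emoji` property, so this uses category `So` (Symbol, other) as a crude stand-in. `strip_pictographs` is a hypothetical name, and where to draw the line is an assumption: this filter also drops symbols such as ©, and it leaves variation selectors and joiners behind.

                      ```python
                      import unicodedata

                      def strip_pictographs(text):
                          """Drop every character whose general category is So (Symbol, other)."""
                          return "".join(ch for ch in text if unicodedata.category(ch) != "So")

                      cleaned = strip_pictographs("Union Flag \U0001F600!")
                      print(cleaned)
                      ```

                      A stricter filter would need the emoji data files from the Unicode Character Database, which the Python stdlib does not ship.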

                      1. 8

                        new character set that is a subset of unicode that removes all of the things that are not text

                        and who’d be the gatekeeper on what the text is and isn’t? What would they say about the ancient Egyptian hieroglyphs? Are they text? If yes, why, they are pictures. If no, why, they encode a language.

                        It might be a shallow dissimilar, but people trying to tell others what forms of writing text are worthy of being supported by text rendering pipelines gets me going.

                        If the implementation is really so problematic, treat emojis as complicated ligatures and render them black and white.

                        1. 3

                          and who’d be the gatekeeper on what the text is and isn’t? What would they say about the ancient Egyptian hieroglyphs? Are they text? If yes, why, they are pictures. If no, why, they encode a language.

                          Hieroglyphics encode a (dead) language. There are different variations on the glyphs depending on who drew them (and what century they lived in) and so they share the property that there is a tight(ish, modulo a few thousand years of drift) coupling between an abstract hieroglyph and meaning and a loose coupling between that abstract hieroglyph and a concrete image that represents it. Recording them as text is useful for processing them because you want to extract the abstract characters and process them.

                          The same is true of Chinese (though traditional vs simplified made this a bit more complex, and the Unicode decision to represent Kanji and Chinese text using the same code points has complicated things somewhat): you can draw the individual characters in different ways (within certain constraints) and convey the same meaning.

                          In contrast, emoji do not convey abstract meaning, they are tightly coupled to the images that are used to represent them. This was demonstrated very clearly by the pistol debacle. Apple decided that a real pistol image was bad because it was used in harassment and decided to replace the image that they rendered with a water pistol. This led to the exact same string being represented by glyphs that conveyed totally different meaning. This is because the glyph not the character encodes meaning for emoji. If you parsed the string as text, there is no possible way of extracting meaning without also knowing the font that is used.

                          Since the image is the meaningful bit, not the character, we should store these things as images and use any of the hundreds of images-and-text formats that we already have.

                          More pragmatically: unicode represents writing schemes. If a set of images have acquired a significant semantic meaning over time, then they may count as a writing system and so can be included. Instead, things are being added in the emoji space as new things that no one is using yet, to try to define a writing scheme (largely for marketing reasons, so that ‘100 new emoji!’ can be a bullet point on new Android or iOS releases).

                          It might be a shallow dissimilar, but people trying to tell others what forms of writing text are worthy of being supported by text rendering pipelines gets me going.

                          It’s not just (or even mostly) about the rendering pipelines (though it is annoying there because emoji are totally unlike anything else and have required entirely new features to be added to font formats to support them), it’s about all of the other things that process text. A core idea of Unicode is that text has meaningful semantics distinct from the glyphs that represent it. Text is a serialisation of language and can be used to process that language in a somewhat abstract representation. What, aside from rendering, can you do with processing of emoji as text that is useful? Can you sort them according to the current locale meaningfully, for example (seriously, how should 🐕 and 🍆 be sorted - they’re in Unicode and so that has to be specified for every locale)? Can you translate them into a different language? Can you extract phonemes from them? Can you, in fact, do anything useful with them that you couldn’t do if you embedded them as images with alt text?

                          1. 11

                            Statistically, no-one cares about hieroglyphics, but lots of people care about being able to preserve emojis intact. So text rendering pipelines need to deal with emojis, which means we get proper hieroglyphics (and other Unicode) “for free”.

                            Plus, being responsible for emoji gives the Unicode Consortium the sort of PR coverage most organizations spend billions to achieve. If this helps them get even more ancient writing systems implemented, it’s a net good.

                            1. 2

                              What, aside from rendering, can you do with processing of emoji as text that is useful?

                              Today, I s/☑️/✅/g a text file.
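
                              The same substitution in Python, assuming the ☑️ in the file is U+2611 followed by the emoji variation selector U+FE0F (the sample line is made up):

                              ```python
                              ballot = "\u2611\ufe0f"  # BALLOT BOX WITH CHECK + VARIATION SELECTOR-16
                              check = "\u2705"         # WHITE HEAVY CHECK MARK

                              line = f"{ballot} write tests, {ballot} ship"
                              # Plain string replacement works because both are ordinary code point sequences.
                              replaced = line.replace(ballot, check)
                              print(replaced)
                              ```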

                              Can you sort them according to the current locale meaningfully, for example (seriously, how should 🐕 and 🍆 be sorted - they’re in Unicode and so that has to be specified for every locale)?

                              Do I have the book for you!

                              Can you translate them into a different language? Can you extract phonemes from them?

                              We can’t even do that with a lot of text! 😝

                          2. 8

                            At this point, there’s a strong argument for a new character set that is a subset of unicode that removes all of the things that are not text.

                            All that’s missing from this sentence to set off all the 🚩 🚩 🚩 is a word like “just” or “simply”.

                            Others have started poking at your definition of “text”, and are correct to do so – are hieroglyphs “text”? how about ideograms? logograms? – but really the problem is that while you may feel you have a consistent rule for demarcating “text” from “images” (or any other “not text” things), standards require getting a bunch of other people to agree with your rule. And that’s going to be difficult, because any such rule will be arbitrary. Yours, for example, mostly seems to count certain very-image-like things as “text” if they’ve been around long enough (Chinese logograms, Egyptian hieroglyphics) while counting other newer ones as “not text” (emoji). So one might reasonably ask you where the line is: how old does the usage have to be in order to make the jump from “image” to “text”? And since you seem to be fixated on a requirement that emoji should always render the same on every platform, what are you going to do about all the variant letter and letter-like characters that are already in Unicode? Do we really need both U+03A9 GREEK CAPITAL LETTER OMEGA and U+2126 OHM SIGN?
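
                            On that last pair: the two are distinct code points, but Unicode treats them as canonically equivalent, so normalisation folds the ohm sign into the Greek letter. A quick check with Python's `unicodedata`:

                            ```python
                            import unicodedata

                            omega = "\u03a9"  # GREEK CAPITAL LETTER OMEGA
                            ohm = "\u2126"    # OHM SIGN

                            distinct = omega != ohm
                            # NFC replaces the ohm sign with its canonical equivalent.
                            folded = unicodedata.normalize("NFC", ohm)
                            print(distinct, folded == omega)
                            ```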

                            etc.

                            1. 1

                              So one might reasonably ask you where the line is: how old does the usage have to be in order to make the jump from “image” to “text”?

                              Do they serialise language? They’re text. Emoji are not a writing system. They might be a precursor to a writing system (most ideographic writing systems started with pictures and were then formalised) but that doesn’t happen until people ascribe common meaning to them beyond ‘this is a picture of X’.

                              And since you seem to be fixated on a requirement that emoji should always render the same on every platform, what are you going to do about all the variant letter and letter-like characters that are already in Unicode?

                              That’s the opposite of my point. Unicode code points represent an abstraction. They are not supposed to require an exact glyph. There are some things in Unicode to allow lossless round tripping through existing character encodings that could be represented as sequences of combining diacritics. They’re not ideal in a pure-Unicode world but they are essential for Unicode’s purpose: being able to represent all text in a form amenable to processing.

                              For each character, there is a large space of possible glyphs that a reader will recognise. The letter A might be anything from a monospaced block character to a curved illustrated drop character from an illuminated manuscript. The picture is not closely coupled to the meaning and changing the picture within that space does not alter the semantics. Emoji do not have that property. They cause confusion when slightly different glyphs are used. Buzzfeed and similar places are full of ‘funny’ exchanges from people interpreting emoji differently, often because they see slightly different glyphs.

                              The way that emoji are used assumes that the receiver of a message will see exactly the same glyph that the sender sends. That isn’t necessary for any writing system. If I send Unicode of English, Greek, Icelandic, Chinese, or ancient Egyptian, the reader’s understanding will not change if they change fonts (as long as the fonts don’t omit glyphs for characters in that space). If someone sends a Unicode message containing emoji, they don’t have that guarantee because there is no abstract semantics associated with them. I send a picture of a dog, you see a different dog, I make a reference to a feature of that dog and that feature isn’t present in your font, you are confused. Non-geeks in my acquaintance refer to them as ‘little pictures’ and think of them in the same way as embedded GIFs. Treating them as characters causes problems but does not solve any problems.

                              1. 2

                                Do they serialise language? They’re text. Emoji are not a writing system. They might be a precursor to a writing system (most ideographic writing systems started with pictures and were then formalised) but that doesn’t happen until people ascribe common meaning to them beyond ‘this is a picture of X’.

                                I think this is going to end up being a far deeper and more complex rabbit hole than the tone of your comment anticipates. Plenty of things that are in Unicode today, and that you undoubtedly would consider to be “text”, do not hold up to this criterion.

                                For example, any character that has regional/dialect/language-specific variations in pronunciation seems to be right out by your rules. So consider, say, Spanish, where in some dialects the sound of something like the “c” in “Barcelona” is /s/ and in others it’s /θ/. It seems hard to say that speakers of different dialects agree on what that character stands for.

                            2. 4

                              At this point, I feel like the cat is out of the bag; people are used to being able to use emoji in almost any text-entry context. Text rendering pipelines are now stuck supporting these icons. With that being the case, wouldn’t it be way more complexity to layer another parsing scheme on top of Unicode in order to represent emoji? I can see the argument that they shouldn’t have been put in there in the first place, but it doesn’t seem like it would be worth it to try to remove them now that they’re already there.

                            3. 10

                              The choice is very easy for me, and unfortunately it has very little to do with the merits of the actual language, and a lot to do with tooling and implementation.

                              When I try to set up a Haskell environment (following https://www.haskell.org/ghcup/), ghci segfaults.

                              OCaml tooling is much better, but needs 128 bits of space to pack two 32-bit ints into a single record.

                              So I use Rust. Top notch tooling, hassle free to install, memory layout is very clear. To be honest I don’t like the C-style syntax, and have no issues with garbage collection. But I value a hassle free tooling experience and control over memory layout too highly.
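
                              For a sense of scale, the layout being asked for is two 32-bit fields stored inline, 64 bits in total. A small sketch using Python's `ctypes`, chosen only to keep the example short; `Pair` is a made-up name, and this says nothing about how OCaml or Rust actually lay out records:

                              ```python
                              import ctypes

                              class Pair(ctypes.Structure):
                                  # Two 32-bit signed integers stored inline, with no boxing or headers.
                                  _fields_ = [("a", ctypes.c_int32), ("b", ctypes.c_int32)]

                              size_in_bits = ctypes.sizeof(Pair) * 8
                              print(size_in_bits)
                              ```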

                              1. 10

                                Do you suppose you could open a ticket for your GHCi issue? I would be very interested to know more about your case.

                                1. 1

                                  You’re right I should make a ticket - but I doubt that’s the place to do it?

                                  Presumably what happened is the installer shell script fetched the wrong binary, or failed to detect it had no binary for my system.

                                  1. 6

                                    @bgamari is one of the primary maintainers for both GHCup and ghci, so that place will do just fine!

                                    1. 1

                                      Oh, I realised you need to sign up for an account on that GitLab page, which is a rather high barrier to filing tickets. If you want to file your issue at https://github.com/tomjaguarpaw/tilapia/issues/new then I will make sure it gets to the right place.

                                      1. 2

                                        How is the barrier any higher than signing up for a Microsoft GitHub account?

                                        1. 1

                                          Sign ups to the Haskell GitLab must be manually approved IIRC. Approvals often go astray, get delayed, and the experience is not conducive to capturing input from a tangentially-interested submitter. (At least it used to be that way. I’m not sure if it’s changed.)

                                          1. 1

                                            Eesh. That sounds like a bad experience.

                                            1. 2

                                              It is, but such is life on a volunteer-run self-hosted instance.

                                              Disclaimer: my experience is dated, things may be better now.

                                              1. 2

                                                Seems GitLab wants to try federation now which could possibly help the situation

                                  2. 6

                                    OCaml tooling is much better, but needs 128 bits of space to pack two 32-bit ints into a single record.

                                    Is there a particular domain you are working where boxing is particularly galling?

                                    1. 4

                                      This should be doable when unboxed types become available! People are definitely working on it.

                                      1. 2

                                        I had no idea this was a thing! that’s great news.

                                    2. 3

                                      One big problem I’ve been running into with Rust is that control over memory layout is lacking in many ways. Especially the ergonomics around tagging pointers, controlling enum niches, and creating sparse struct of arrays (which in some cases is impossible to do in a way Zig does it).

                                      Writing a compiler confirms my observed pain points about Rust for writing high-performance code.

                                      1. 1

                                        yeah Zig really is the language to beat when it comes to fine grained and ergonomic control of memory.

                                        Still rust is leagues ahead of ocaml and haskell here.

                                      1. 9

                                        When will people learn? 30 years of Simonyi’s work, the countless hours put into IntelliJ’s MPS and Eclipse’s XText, half of Bracha’s thankless toiling, all ended up being ignored. Structural editing needs to operate “in the background” the way Tree-sitter does. People are used to editing text. Let people mangle the text how they see fit, and just update your model when it parses. I’m the first in line to rail against plain text as the storage format and API of a programming language, but it is the best user interface for actually writing code when you’re trying to get ideas out of your head into the computer. It doesn’t have to be plain monospaced bollocks though, let people format it, set their own colours for bindings, insert tables, put images in comments, etc.
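
                                        A minimal sketch of "update your model when it parses", using Python's `ast` as a stand-in parser. `BackgroundModel` and its fields are hypothetical names; Tree-sitter itself goes further, with incremental and error-tolerant parsing, which this does not attempt.

                                        ```python
                                        import ast

                                        class BackgroundModel:
                                            """Keep the last tree that parsed; ignore edits that leave the text invalid."""

                                            def __init__(self):
                                                self.tree = None    # last successfully parsed AST
                                                self.stale = False  # True while the text no longer matches the tree

                                            def on_text_changed(self, text):
                                                try:
                                                    self.tree = ast.parse(text)
                                                    self.stale = False
                                                except SyntaxError:
                                                    # Mid-edit garbage: keep the old model and wait for the next edit.
                                                    self.stale = True


                                        model = BackgroundModel()
                                        model.on_text_changed("x = 1\n")
                                        model.on_text_changed("x = (1 +\n")  # user is halfway through typing
                                        print(type(model.tree).__name__, model.stale)
                                        ```

                                        The editor stays a plain text box; structural features simply read `model.tree` and can grey themselves out while `stale` is set.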

                                        1. 11

                                          People are used to editing text

                                          No one ever made progress by giving people more of what they already had.

                                          it is the best user interface for actually writing code when you’re trying to get ideas out of your head

                                          It is easier to learn text. And you already know text. That doesn’t mean it is the best ui. I also maintain that there haven’t been any really good structural editors yet, though fructure and hazel look interesting—real advantage comes from large-scale manipulation and analysis (incl. programmatic—take semgrep-type as one random example, but can be much more powerful and simpler; also smalltalk class browser). Text-based interfaces may converge on something in the ballpark of what you can do with pure structure eventually, but at the cost of much pointless complexity and—still—some power (eg presentation types for ast nodes), and giving back nothing in exchange but familiarity and a bit of ease of learning. If what you want is to get ideas out of your head, code may not be the most appropriate medium—that’s fine. I often work out ideas with pen and paper (NB. not text either!).

                                          1. 4

                                            I don’t think that maintaining bidirectional mapping between text and AST is complex, it’s just little known how to do that. If you factor the cost of associated tooling which just works with text and needs to be re-implemented for each non-text language (editor, vcs, online forge), it seems that working with text is a couple of orders of magnitude less complex?
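
                                            A small illustration of why the mapping needs some care, using Python's stdlib `ast` (`ast.unparse` needs Python 3.9+): a plain AST round trip keeps the structure but drops layout and comments, so a two-way mapping has to keep that extra information somewhere. The sample source line is made up.

                                            ```python
                                            import ast

                                            source = "total = (1 +   2)  # keep me\n"
                                            tree = ast.parse(source)

                                            # Text -> AST -> text: the structure survives, the layout and comment do not.
                                            regenerated = ast.unparse(tree)
                                            print(regenerated)

                                            # Parsing the regenerated text gives back an equivalent tree.
                                            same_structure = ast.dump(ast.parse(regenerated)) == ast.dump(tree)
                                            print(same_structure)
                                            ```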

                                            My impression is that JetBrains basically figured out how to do syntactic and semantic aware coding in early aughts, and then everyone ignored this cluster of ideas for practical usage until LSP and Tree-sitter came along.

                                            1. 6

                                              AST

                                              ASTs are simple and hierarchical. If the goal is to support complex, graphical relationships, this becomes much more challenging. You would at the very least need weak pointers and in-band signalling—bad things, both.

                                              editor, vcs, online forge

                                              Editor I have argued is simplified by operating only on structure; unless you disagree, leave that aside. The basic function of a vcs can work just the same on opaque files (arbitrary serialisation format—necessary, but incidental and trivial); patch/diff you want to be language-aware anyway. Online forge I don’t find too interesting—in particular, its functions should be subsumed by the editing environment; e.g., you should be able to have strong pointers between tickets and code elements—but for the ‘online’ part, you may perhaps compile your editor to javascript or wasm.

                                              More broadly, conceptually unintegrated (cf brooks) programming environments are bad; generic tooling is going to be suboptimal anyway. The preponderance of independent, disconnected programming languages and environments is an unfortunate accident that would not have been necessary given better abstractions, though it may not have been avoidable (see gabriel): the hegemony of mediocre unix meant that it was necessary to make better programming languages on top of it, but they could only interoperate at the level of its abstractions, which were limited to flat, inexpressive text, which was of necessity somewhat removed from the far richer and more useful structural abstractions used within those languages. A better-abstracted and better-constructed system would admit structural interoperation.

                                            2. 3

                                              Note that Sophistifunk is talking about writing code while also maintaining a model which isn’t plain text. In that scenario, structured manipulation across the code base is still possible via the model alongside having localised “freeform” text entry. Personally I lean towards your view, but the freeform entry direction is also worth exploring I think.

                                              1. 4

                                                writing code while also maintaining a model which isn’t plain text

                                                Yes—‘projectional editing’. But it becomes increasingly cumbersome to maintain the textual mapping—hence ‘much pointless complexity’—if you want to go both ways. If it’s a one-way mapping—write a new function as text, compile it to structure and integrate it with the running system, and manipulate it thus thereafter—then fine, I suppose, but also kind of pointless imo. I’ll refer also to a brief exchange on the matter I had with geoff langdale.

                                                1. 2

                                                  Taking this further: I find it completely reasonable that a good structured editor might support complex transactions which can not easily be decomposed into structurally and semantically coherent small-steps. But that does not at all mean that textual manipulation is the best way to implement those transactions.

                                              2. 5

                                                When will people learn?

                                                I’d love it if I learned more about this right now :)

                                                I’ve learned about structural editing recently, and I’d love pointers to good resources.

                                                Structural editing needs to operate “in the background” the way treesitter does.

                                                What do you mean by this?

                                                1. 5

                                                  What I mean is that you maintain a structured model of the code that is used to drive the rest of your process, and also to derive useful information to help the human editing the code, but don’t restrict the user to operations that are atomic with respect to maintaining a correct model. You can allow the user all the wonderful model-aware operations should they choose to trigger one, but don’t force them to do so. If they just want to mess with the text, let them do it, and simply maintain your last-known-good model of the text while they do so. When the text once again becomes parseable, then you update your model and re-run any dependent inspections / start showing errors / whatever you would normally update if they had made the change wholesale as an atomic operation.
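
                                                  A minimal sketch of that last-known-good idea in Python, using the standard `ast` module as a stand-in parser. The class and method names here are made up for illustration; a real editor would parse incrementally and keep far more state.

```python
import ast
from typing import List, Optional


class LastKnownGoodModel:
    """Keeps the most recent tree that parsed while the text is free to be invalid."""

    def __init__(self, text: str = "") -> None:
        self.text = text
        self.tree: Optional[ast.AST] = None  # last successfully parsed model
        self.stale = False                   # True while text no longer matches tree
        self.update(text)

    def update(self, new_text: str) -> bool:
        """Record an arbitrary text edit; refresh the model only if it parses."""
        self.text = new_text
        try:
            self.tree = ast.parse(new_text)
        except SyntaxError:
            self.stale = True  # keep the old tree, just flag it as out of date
            return False
        self.stale = False
        return True

    def function_names(self) -> List[str]:
        """Example of a model-driven query, answered from the last good tree."""
        if self.tree is None:
            return []
        return [n.name for n in ast.walk(self.tree) if isinstance(n, ast.FunctionDef)]


model = LastKnownGoodModel("def f():\n    return 1\n")
model.update("def f(:\n    return 1\n")      # mid-edit, unparseable
names_while_broken = model.function_names()  # still answered from the old tree
model.update("def g():\n    return 1\n")      # parseable again, so the model refreshes
```

                                                  While the text is broken, queries keep working against the older tree, and the `stale` flag tells dependent inspections that their answers may be out of date.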

                                                  In structural editors where it’s touted that the user “can’t put the source into an invalid state” it means that the user needs to already know the shortcuts and/or the names for whatever each step would be to get the text from where it is now to what they want it to be, which can and often does present a serious extra cognitive load on the user, as well as often requiring a lot more steps to get from A to B, if the program needs to be at least syntactically valid at each point during the transition.

                                                  1. 1

                                                    Hello, I am making a multi-language structural editor.

                                                    the user needs to already know the shortcuts and/or the names for whatever each step would be to get the text from where it is now to what they want it to be

                                                    Indeed, while text can be edited with “left, right, up, down, backspace, newline, characters”, trees have more structure and so require more basic commands. Mapping it out, having a set of commands that’s about as powerful as the basic Vim commands seems to require roughly the same number of keys, with related but tree-y functionality. Here’s a sketch for a Vim-like key map (just the first set; everything after “——” is old notes):

                                                    https://github.com/justinpombrio/synless/blob/master/doc/commands.txt

                                                    There’s one thing you need to know while editing that isn’t present in the key map: when inserting a node, you need to say what node type it is (e.g. “+” or “function” or “if”). Our plan is that there’s a single key for each (e.g. “+”, “f”, “i”) that is shown on screen when you go to insert, to make them discoverable. This should actually increase discoverability compared to a text editor, when editing a language you’re not very familiar with.

                                                    as well as often requiring a lot more steps to get from A to B, if the program needs to be at least syntactically valid at each point during the transition

                                                    I’ve heard this sentiment a lot, but haven’t heard of good examples. Can you think of a few? I’m very curious what obstacles there are.

                                                    One important thing, at least in the structural editor I’m making, is that there can be “holes” in the document. For example, showing a hole as ?, an if/else whose condition hasn’t been filled in would be if (?) { x += 1; } else { x -= 1; }. So there’s a little bit of allowed syntactic invalidity that might make it easier to traverse between syntactically valid states.
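
                                                    As a rough illustration of holes (not how any particular editor represents them), here is a tiny Python tree where a `Hole` is just another node type. All of the names (`Hole`, `Code`, `IfElse`, `render`, `is_complete`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Hole:
    """A placeholder for a node that has not been filled in yet."""


@dataclass
class Code:
    """Opaque leaf standing in for an already-written expression or statement."""
    text: str


@dataclass
class IfElse:
    cond: "Node"
    then: "Node"
    orelse: "Node"


Node = Union[Hole, Code, IfElse]


def render(node: Node) -> str:
    """Print the tree as text, showing each hole as '?'."""
    if isinstance(node, Hole):
        return "?"
    if isinstance(node, Code):
        return node.text
    return (
        f"if ({render(node.cond)}) {{ {render(node.then)} }} "
        f"else {{ {render(node.orelse)} }}"
    )


def is_complete(node: Node) -> bool:
    """True when no holes remain anywhere in the tree."""
    if isinstance(node, Hole):
        return False
    if isinstance(node, IfElse):
        return all(is_complete(c) for c in (node.cond, node.then, node.orelse))
    return True


draft = IfElse(Hole(), Code("x += 1;"), Code("x -= 1;"))
```

                                                    The tree is always well-formed; whether it is finished is a separate question that tooling can ask with something like `is_complete`.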

                                                    1. 2

                                                      I’ve heard this sentiment a lot, but haven’t heard of good examples. Can you think of a few?

                                                      Any time in your life you’ve been moving things around and the text has been an invalid program for a few minutes. Every single time that’s happened, you’d have had to stop and re-think not just “what do I want the code to look like when I’m done with this edit?” but also “how do I get to where I want to be using only the allowable operations within this editor?”. And perhaps when you’re the one building the structural editor this seems like it’s not a burden, but as somebody who needs to do stuff like this all the time, I can assure you it is. I’m not interested in adding any barriers to my job, I want tools to help me, to work for me, not constrict me in order to make their own implementation easier.

                                                      1. 3

                                                        Any time in your life you’ve been moving things around and the text has been an invalid program for a few minutes.

                                                        Can you be more specific, like with actual code examples? Because I move things around all the time, and the program tends to stay syntactically valid the whole time. Or rather, it sometimes becomes invalid but only because of silly punctuation things that a structural editor would take care of. Some examples:

                                                        Example 1: swapping two list elems in JSON by cutting and pasting.

                                                        In a text editor:

                                                        [
                                                            elem_1,
                                                            elem_2,
                                                            elem_3
                                                        ]
                                                        -->
                                                        [
                                                            elem_3 // invalid: missing comma
                                                            elem_2,
                                                            elem_1, // invalid: comma not allowed
                                                        ]
                                                        

                                                        In a structural editor, you cut the third element and paste it before the first element, then cut the first element and paste it after the second element, and at all five of these steps the commas are correct.
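
                                                        A small Python sketch of why the commas stay correct: the list is stored as elements only, and separators are produced when rendering. The helpers `render_list` and `move` are illustrative names, not taken from any editor.

```python
from typing import List


def render_list(elems: List[str]) -> str:
    """Serialise a list node; separators are derived, never stored."""
    if not elems:
        return "[]"
    body = ",\n".join(f"    {e}" for e in elems)
    return f"[\n{body}\n]"


def move(elems: List[str], src: int, dst: int) -> List[str]:
    """Cut the element at index src and paste it so it ends up at index dst."""
    out = list(elems)
    out.insert(dst, out.pop(src))
    return out


start = ["elem_1", "elem_2", "elem_3"]
step1 = move(start, 2, 0)  # cut elem_3, paste it before elem_1
step2 = move(step1, 1, 2)  # cut elem_1, paste it after elem_2
```

                                                        Every intermediate list renders with a comma after each element except the last, because there is no stored punctuation to get out of sync.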

                                                        Example 2: wrapping some stuff in a function.

                                                        Say you realize that some code ought to be inside a function so you can re-use it, and you write half the function but don’t finish it:

                                                        fn big_function() {
                                                            ...
                                                            forward_map.insert(key, val);
                                                            reverse_map.insert(val, key);
                                                            ...
                                                        }
                                                        -->
                                                        // syntactically invalid because key doesn't have a type
                                                        fn helper_function(key) {
                                                            forward_map.insert(key, val);
                                                            reverse_map.insert(val, key);
                                                        }
                                                        fn big_function() {
                                                            ...
                                                        }
                                                        

                                                        with holes, this is actually “syntactically valid”:

                                                        fn helper_function(key: ?) {
                                                            forward_map.insert(key, val);
                                                            reverse_map.insert(val, key);
                                                        }
                                                        fn big_function() {
                                                            ...
                                                        }
                                                        

                                                        I like to keep my program syntactically valid, because my auto-formatter only works when the program parses. So it’s possible that you edit differently than I do, going through more invalid states. If so, I’m very curious what states those are.

                                                        A wise man once said that critics are your best friends because, unlike your fans, they help you improve your work with constructive feedback. Won’t you be my friend?

                                                  2. 2

                                                    I’ve learned about structural editing recently, and I’d love pointers to good resources.

                                                    Check out hazel and associated papers!

                                                    1. 2

                                                      Hazel is a great project.

                                                      Another interesting one to look at is https://tylr.fun/ very pretty

                                                  3. 4

                                                    Uhu, I also tend to think that anything you can do in a structural editor, you can do in a normal editor as well, plus you’ll get all nice pure-text editing features (e.g., multiple cursors) for free.

                                                    The only issue here is that for this to work you also need a parser which doesn’t produce a train wreck of an AST if there’s a missing semicolon, and there’s frustratingly little information on how to do that. https://matklad.github.io/2023/05/21/resilient-ll-parsing-tutorial.html explains the trick.
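
                                                    To give a flavour of the general idea (this is a heavily simplified sketch, not the linked tutorial’s code), here is a toy Python parser for a made-up grammar of `let NAME = NUMBER ;` statements. When an expected token is missing, it records an error and carries on without consuming anything, so one missing semicolon does not wreck the rest of the result.

```python
import re
from typing import Callable, List, Tuple


def tokenize(src: str) -> List[str]:
    """Split into identifiers, numbers, '=' and ';' (other characters are ignored)."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[=;]", src)


def parse(src: str) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (statements, errors); statements use '?' for missing parts."""
    toks = tokenize(src)
    pos = 0
    stmts: List[Tuple[str, str]] = []
    errors: List[str] = []

    def expect(pred: Callable[[str], bool], what: str) -> str:
        nonlocal pos
        if pos < len(toks) and pred(toks[pos]):
            pos += 1
            return toks[pos - 1]
        errors.append(f"expected {what} at token {pos}")
        return "?"  # placeholder; the offending token is left for the caller

    while pos < len(toks):
        if toks[pos] != "let":
            errors.append(f"unexpected {toks[pos]!r} at token {pos}")
            pos += 1  # skip ahead until a statement can start
            continue
        pos += 1  # consume 'let'
        name = expect(lambda t: t.isidentifier() and t != "let", "name")
        expect(lambda t: t == "=", "'='")
        value = expect(str.isdigit, "number")
        expect(lambda t: t == ";", "';'")
        stmts.append((name, value))
    return stmts, errors


stmts, errors = parse("let a = 1 let b = 2;")  # first statement lacks its ';'
```

                                                    Both statements survive in `stmts`, and the missing semicolon shows up only as one entry in `errors`.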

                                                    1. 3

                                                      People are used to editing text

                                                      All of them? Speaking for at least one person I like editing the structure directly.

                                                      Probably I should be clearer. I’m building the language for myself and (if there are any) like-minded people. The project (language & editor) is optimised for exploring ground that is new and interesting to me and not for mass appeal. For potential mainstream appeal I point people at https://gleam.run/ anyway.

                                                      This point is made slightly clearer in the project README. https://github.com/CrowdHailer/eyg-lang/

                                                    2. 1

                                                      https://www.causalislands.com/ Future of computing conference in Toronto next month!

                                                      1. 1

                                                        What’s the probability this standard deviation occurs?

                                                          1. 5

                                                            Weekdays may be stable through the millennia, but calendars come and go. It’s 27 Dec 2022 according to the Julian calendar today.

                                                        1. 5

                                                          I think Ada could be just the coolest language with a modern ecosystem. Alire the cargo-like package manager is a great step in that direction!

                                                          1. 3

                                                            Deno made a lot of progress, and is still making progress, but I’m a little disappointed with how much it has seemingly lagged. As soon as I tried to use it for any web development I always ran into how pervasive node is and how non-standard most Typescript implementations are. I forget the details, but I was using an Adobe design framework that had a Typescript implementation, and Deno was pretty strict and the exported modules were not doing something in the language spec.

                                                            It was a huge hindrance to adoption and ultimately I just went back to using golang for the backend and nodejs for the frontend. I was hoping to use Typescript throughout, it would have been a game changer but to do so you need to really roll your own frameworks. Hopefully someone will come up with a solution that lets you use legacy frameworks seamlessly with Deno.

                                                            Svelte was another thing I couldn’t use, so hopefully this solves that. But there’s so much out there that isn’t supported that needing just one framework or library pretty much means either taking on the burden of writing it yourself or introducing the mess of nodejs into the project, and at that point you might as well just go nodejs all the way.

                                                            1. 3

                                                              Not sure if you’ve seen this, it outlines Deno’s recent intent to improve compatibility with node:

                                                              https://lobste.rs/s/gmhj3l/big_changes_ahead_for_deno

                                                              1. 1

                                                                It was a huge hindrance to adoption and ultimately I just went back to using golang for the backend and nodejs for the frontend.

                                                                I’m often confused at first at these types of comments, e.g. (“wait why/how would you use nodejs in front-end”), but then I realize what you meant.

                                                              2. 1

                                                                Very cool! Would love to see a photo of the charts with live view!

                                                                1. 2

                                                                  I think these “restarts” described by the author are algebraic effects by another name. This might be a good overview: https://overreacted.io/algebraic-effects-for-the-rest-of-us/

                                                                  FWIW I am a big fan of effect systems and think they represent a path forward for a lot of the problems in programming: exceptions, concurrency, maybe even memory management.

                                                                  Weirdly, the title of the post mentions “Structurally-Typed” but that is not mentioned in the rest of the post. They’re right, though, you have to use structural typing to type effect systems like this. This article describes using a structural subtyping algorithm to type an effect system as an addition to OCaml.

                                                                  1. 5

                                                                    I have mixed feelings about this headline.

                                                                    I mean, it’s great, but seems like an indictment of our situation that $15k for a major project is somehow “a lot”, enough to become a headline.

                                                                    1. 2

                                                                      I agree.

                                                                      In case people are not familiar with urllib3: it’s one that powers Python’s requests library.

                                                                      1. 1

                                                                        On the contrary, if a person wants monetary compensation for their work, they shouldn’t give it away for free.

                                                                        If making money with software is the goal, “free as in beer” open source is the wrong way to go about it.

                                                                        That said, $15k can be great motivation to get people to work on tricky bugs that would otherwise languish.

                                                                        1. 3

                                                                          You misread what you’re replying to. @singlepolyma was saying that $15k should be too little to make headlines, not that it’s too much.

                                                                      2. 11

                                                                        Hiya lobsters, if anyone is interested in working on open source, we have lots of “Contributor Friendly” tagged issues: https://github.com/urllib3/urllib3/issues

                                                                        We can even compensate for some kinds of issues! Pop into our Discord chat to discuss. :)

                                                                        1. 13

                                                                          Cool project but I really wish I didn’t have to use Discord.

                                                                          1. 12

                                                                            Maintaining a big open source project is hard enough as it is, this is the sweet spot for us right now. We’ve changed several chat platforms over the years (we used Gitter for a while, for example), who knows what will be next!

                                                                            1. 2

                                                                              Just out of curiosity, why is that?

                                                                          2. 3

                                                                            This is the first time I’ve wanted to be able to display graphics in my terminal…

                                                                            1. 10

                                                                              I’m working on my alternative frontend for Medium called Scribe and would love some people to kick the tires. Specifically looking to see if you run into anything that feels broken or missing.

                                                                              1. 6

                                                                                Looks cool! One idea: replace internal links to medium with their scribe equivalent.

                                                                                1. 5

                                                                                  Yeah, good idea. Why didn’t I think of that 😆

                                                                                2. 2

                                                                                  I need this in my life. I will try it when I hit the Medium brick wall.

                                                                                3. 2

                                                                                  I just learned about Orca from this snippet of the video and it is so cool — thanks for sharing!

                                                                                  1. 8

                                                                                    A while ago I hacked the ability to read files and write files into jq 🙈 Here’s some of the resulting code that writes to a bunch of files:

                                                                                    poems
                                                                                      | reverse
                                                                                      | add_numbers
                                                                                      | map(compile_single)
                                                                                      | [ write("index.html"; .[-1].content), map(write("poem_\(.number).html"; .content)) ]
                                                                                      | flatten
                                                                                    
                                                                                    1. 7

                                                                                      Suggest only the plt tag, since tags are filtered via intersection and not union. :)

                                                                                      1. 12

                                                                                        Tags should be partially ordered in a lattice, with “sub-tags” being substitutable for any “super-tags” higher up in the lattice.
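
                                                                                        A quick Python sketch of what that could look like. The tag hierarchy below is invented for illustration and is only a partial order (each tag lists its direct super-tags), not a full lattice with meets and joins.

```python
from typing import Dict, Set

# Hypothetical hierarchy: each tag maps to its direct super-tags.
SUPER_TAGS: Dict[str, Set[str]] = {
    "plt": {"programming"},
    "compilers": {"plt"},
    "rust": {"programming"},
    "programming": set(),
}


def ancestors(tag: str) -> Set[str]:
    """The tag itself plus every super-tag reachable above it."""
    seen: Set[str] = set()
    stack = [tag]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(SUPER_TAGS.get(t, ()))
    return seen


def matches(story_tags: Set[str], wanted: str) -> bool:
    """A story matches if any of its tags is the wanted tag or a sub-tag of it."""
    return any(wanted in ancestors(t) for t in story_tags)
```

                                                                                        With this, a story tagged only `compilers` would show up for someone following `plt` or `programming`, but a story tagged only `programming` would not show up under `plt`.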

                                                                                        1. 4

                                                                                          If we’re brainstorming, I’d suggest not ordering them initially and instead allowing people to use a blang to query them.

                                                                                          1. 2

                                                                                            There’s also noms which has been around longer.

                                                                                            1. 2

                                                                                              Dolt uses noms internally, but it doesn’t seem like noms is a ready end-user product. Dolt is a UI layer and workflow for end-users on top of noms.

                                                                                          2. 2

                                                                                            Does anyone know what IR this backend takes as input? Bucklescript has had difficulty staying up-to-date with the latest ocaml version I think in part because it consumes the initially parsed AST.

                                                                                            1. 2

                                                                                              Answering my own question… It looks like it takes a Typed Tree as input, so quite high level still, just after type checking.