It gets the key aspect, which I think is missing from the linked article:
Every component is a buffer/editor
This is also a core philosophy of Emacs and Vim; however, recent modal editors such as Kakoune, Neovim, and Helix took another approach (the standard GUI approach), where every component is different.
VS Code/Zed are quite different beasts compared to Emacs/acme. They are GUIs, with one widget being (a very good implementation of) a rich-text text area. In contrast, Emacs is a rich-text text area, and the GUI is built using text as a building block (using the trivial but powerful idea of selecting & clicking text as the main interaction method).
The fact that Ki derides Neovim, Kakoune, and Helix for not being text enough gives me hope!
Yeah, I’ve had a couple of people point me towards Ki in issues on ad in relation to how it uses tree-sitter to power motions and selections. It sounds like a lot of fun! Unless I’m misunderstanding, though, I think it’s taking a different approach to what the fundamental building block for a buffer is: namely a tree-sitter AST rather than plaintext. I can see the power of that when you are working with buffers containing files written in languages that have tree-sitter support, but presumably the functionality of the editor then varies depending on what the filetype is.
It’s obviously less of a concern if you don’t also want to have the acme-like extensibility I’m aiming for in ad, where you end up with lots of buffers containing custom text content that won’t map to an existing tree-sitter grammar.
The ideas of using a structured buffer and being able to work with plaintext normally are not incompatible. In a way, the challenge of editing plain text is basically no different from the challenge of editing text inside a string, just without some of the edge cases that strings involve.
My understanding is that tree-sitter is not designed to be the primary backing state for an editor; rather, it is meant to sit alongside a rope data structure of some kind, which is considered the source of truth for tree-sitter’s incremental re-parsing.
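As a rough sketch of that architecture (toy Python with hypothetical names, not any editor’s real implementation; real editors pair a rope library with tree-sitter’s incremental parsing API): edits land in the authoritative text first, and the parser is then handed an edit record so it can update its derived tree.

```python
# Toy sketch of "rope as source of truth, parse tree as derived state".
# The parser never owns the text: every edit is applied to the buffer
# first, and the parser is told about it so it can reparse incrementally.

class Buffer:
    """Stand-in for a rope: the single source of truth for the text."""
    def __init__(self, text):
        self.text = text

    def replace(self, start, end, new):
        self.text = self.text[:start] + new + self.text[end:]
        # This edit record is what an incremental parser needs in order
        # to shift/invalidate only the affected part of its old tree.
        return {"start": start, "old_end": end, "new_end": start + len(new)}

class ToyIncrementalParser:
    """Stand-in for tree-sitter: holds a cached tree derived from the buffer."""
    def __init__(self):
        self.tree = None

    def reparse(self, buffer, edit=None):
        # A real parser would use `edit` to reuse unchanged subtrees;
        # here we only model that the tree always follows the buffer.
        self.tree = {"covers_bytes": len(buffer.text), "last_edit": edit}
        return self.tree

buf = Buffer("let x = 1;")
parser = ToyIncrementalParser()
parser.reparse(buf)
edit = buf.replace(8, 9, "42")      # "let x = 1;" -> "let x = 42;"
tree = parser.reparse(buf, edit)
```

The important property is the direction of the data flow: the tree is always recomputable from the buffer, never the other way around.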
I’ve tackled the same problem but in a completely different way: instead of having flat text next to a tree structure, I put the text and the AST structure together in one tree.
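A toy contrast with the rope-plus-tree split (my own sketch, not the commenter’s actual implementation): if the tree itself owns the text, with leaves carrying the characters, then the flat document is just an in-order concatenation of the leaves.

```python
# Toy "one tree" buffer: interior nodes carry structure, leaf nodes carry
# the actual characters. There is no separate flat buffer; the flat view
# is derived by walking the tree in order.

class Node:
    def __init__(self, kind, children=None, text=""):
        self.kind = kind
        self.children = children or []
        self.text = text  # only meaningful for leaf nodes

    def flatten(self):
        """The plaintext view: concatenate the leaves in order."""
        if not self.children:
            return self.text
        return "".join(child.flatten() for child in self.children)

stmt = Node("let_statement", [
    Node("keyword", text="let "),
    Node("identifier", text="x"),
    Node("operator", text=" = "),
    Node("number", text="1"),
    Node("semicolon", text=";"),
])
```

Editing then means splicing leaves, and the plaintext fallback is always available via `flatten()`; the trade-off is that every plain-text operation has to be expressed against the tree.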
I suppose that’s what I’m getting at really: if you use this approach then your “base layer” is a tree, not a plaintext buffer. There’s no problem with that, but it really does alter how you set up the rest of the abstractions in the editor / user interface.
ad might not be the best example of how to tackle this in a “what would your ideal solution look like?” kind of way, but the intent is to be transparent about the fact that all you are guaranteed is that the contents of each buffer are valid UTF-8 encoded text. Everything else is built on the idea of selecting contiguous ranges within that text and then making edits, or acme-style executing and loading. That way you can dump the output of any unix-y “write text to stdout” program back into a buffer and retain the same interface (as opposed to having to identify the type of buffer you are working with and interpreting the content accordingly).
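That interface can be boiled down to something like the following (a hypothetical sketch, not ad’s actual API): a buffer is just text, every operation is a contiguous range plus an edit, and any program’s stdout can be loaded without interpretation.

```python
# Sketch of "everything is UTF-8 text plus range edits": no buffer types,
# no per-filetype interpretation; just text, selections, and edits.

class TextBuffer:
    def __init__(self, text=""):
        self.text = text

    def load(self, stdout_text):
        # Dump any "write text to stdout" program's output straight in;
        # the content is never inspected or typed.
        self.text = stdout_text

    def select(self, start, end):
        """A selection is just a contiguous range of the text."""
        return self.text[start:end]

    def edit(self, start, end, replacement):
        """Every change is expressed as an edit to such a range."""
        self.text = self.text[:start] + replacement + self.text[end:]

buf = TextBuffer()
buf.load("foo.txt\nbar.txt\n")   # e.g. a piped-in directory listing
buf.edit(0, 7, "baz.txt")        # edit the range that held "foo.txt"
```

Because nothing here depends on what the text means, the same three operations work identically on source code, command output, or a scratch buffer.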
I designed CSTML around the same guarantee: that the contents of a document tree can always be expressed in terms of a single flat stream of UTF-8 encoded text. This is very different from a language like JSON, which makes no such guarantee that the structure it describes is superimposed onto (“interleaved with the content of” would be more accurate) what is essentially a text buffer.
A CSTML-outputting parser can be thought of as consuming input while producing an output which always contains the input embedded in it. Embedding the input allows us to attach all kinds of useful metadata on the embedded representation. This is not unlike a paragraph in HTML, which can be understood to have a single textual representation even as something like a <span> is used to denote certain structure within the flat text.
The end result is that you’re guaranteed to be able to select contiguous ranges of text (without knowing anything at all about what language is contained in a document), and you are also always able to know if your selection corresponds to any structures recognized by the parser. If no structures are recognized, you’ll simply appear to be editing one big flat text file. I have taken care to ensure that the system remains performant in this case.
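A toy illustration of that guarantee (my own sketch, not CSTML’s real data model): the document is one flat string, and recognized structures are optional range annotations over it, so selection works identically with or without them.

```python
# Toy "structure superimposed on flat text": the document is a plain
# string; recognized structures are just (start, end, kind) annotations.
# Selecting a contiguous range never requires knowing the language.

def structures_covering(annotations, start, end):
    """Return the recognized structures that fully contain [start, end)."""
    return [a for a in annotations if a[0] <= start and end <= a[1]]

doc = "print('hi')"
annotations = [
    (0, 11, "call_expression"),
    (6, 10, "string"),
]

selection = doc[6:10]                          # plain contiguous range
hits = structures_covering(annotations, 6, 10)

# With no annotations at all it degrades gracefully to a flat text file:
no_hits = structures_covering([], 6, 10)
```

The fallback case is the key point: an unparsed document is not a different kind of object, just one with an empty annotation list.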
By ‘text focused’ here I mean things that want primarily to display text and have some controls and user interface elements.
The text part feels like a distraction here. Isn’t mutability and extensibility the principal goal? What if you had a Smalltalk that was natively integrated instead of being isolated, and had a really nice text editor widget that you could plug in where you want? I feel like PowerShell was a major lost opportunity. If they had produced a Smalltalk-style system that interoperated with .NET, it would have filled a lot of these gaps. Instead we got a sort-of Unix shell.
My view is that people are choosing environments like Electron and GNU Emacs because they want to deal with a relatively straightforward and simple programming environment for their text-focused applications. I don’t think you can get this simplicity in a general purpose GUI programming environment; the very flexibility such an environment needs cuts against the simplicity that people want. Constraining the environment to be text-focused and to not have API features that go much beyond that is an important path to making it straightforward and simple enough to attract people away from existing things like GNU Emacs. Of course such a text-focused environment will probably be built on top of a more general purpose GUI environment, but I think that exposing very much of the underlying full API is probably a mistake if it’s even possible within the approach the text-focused environment takes.
I generally share your perspective and have been a very long-time Emacs user (both CLI and with a real GUI window). I still use Emacs as a code editor but in the last 6 months or so have mostly migrated from org-mode to Obsidian.md for day-to-day notetaking, journalling/work logs, sketching out architecture diagrams (using text-based languages like Mermaid or Pikchr), etc. It seems to strike a really great balance between still being very text focused while also handling UI stuff very effectively.
When I first started using it I was using the Canvas feature pretty heavily. I still use it when I’m brainstorming or mapping out a complex problem but have definitely shifted back towards mostly just straight text (Markdown) documents.
One thing that I am really curious about is how effective it might be to try mixing a code editor and the infinite canvas idea. Being able to dynamically visualize the interactions between different source files and navigate around that map has a lot of potential, I think, but I haven’t had enough spare time to actually try prototyping it out. Additionally I’ve always had a bit of a soft spot for Literate Programming but haven’t generally found an environment that I’m happy enough to commit to (org-babel is… ok, but kind of awkward).
This echoes something I’ve been saying for a while, although ironically it also sounds like the opposite. I’ve been saying we have standards for text (TUI) and interactive documents (HTML) but nothing for direct GUIs, which are entirely platform-dependent.
Everything being HTML really isn’t ground-breaking. Microsoft themselves figured it would happen, which is why they experimented with HTML-based applications in Windows Neptune. It really just makes sense: HTML is the mecca of human-focused digital documents; it fits every medium and solves every purpose.
I think the comments about VS Code are a little naive to its history and intent, though. VS Code is, of course, inspired by Atom, and both editors were designed for creating websites from day one. Sure, they worked as general-purpose text editors, kind of like Visual Studio proper, but their actual intent was website creation, so of course they were made in a web browser. No other editor before or after has done the same, because they really don’t need to. Not that it’s a bad idea; it’s not at all, actually. It has a lot of benefits, like being embeddable in any website. But most people making a text editor don’t need that, so they don’t use HTML. It actually reminds me of what the Mozilla developers said about Firefox’s error reporting prompt. It can’t be an HTML page, because it has to tell the user what went wrong with the browser, nor would it benefit from being HTML anyway. So they just worked out the lowest-common-denominator API the error reporting system needs, and made a really dumbed-down GUI API that wrapped all the common ones (GTK, Qt, WinAPI, Cocoa). Most of the time, that’s really all you need!
It’s still in the early stages, but this is pretty much the goal of the ad editor that I’m working on: https://github.com/sminez/ad
It’s also something that Matklad has been thinking about here: https://github.com/matklad/abont
I am also looking at the ki editor: https://ki-editor.github.io
The Ki editor link above returns a 404 response for me.
Link to the Ki editor homepage: https://ki-editor.github.io/ki-editor/
(I’m the author of the linked-to article.)
Another example of an application built on Emacs was Amazon’s early in-house customer service app:
https://sites.google.com/site/steveyegge2/tour-de-babel#h.p_ID_191
It’s like how cloud status pages need to be hosted on someone else’s cloud.