unfortunately the ms folks locked the thread, what a pity. wonder if anyone’s sent the video to them yet :-)
Even though they might be wrong, leaving the thread open would only let the randos on GitHub pile abuse on them, and that's not really necessary.
I hope they can watch the video and reflect on whether the main reason they cite, code maintainability, really warrants code that is two to three orders of magnitude slower. I personally doubt it.
I really doubt it as well. If anything, Muratori's code here is even more maintainable just by virtue of having fewer lines of code. I think the only excuse that sort of holds water is strict backwards-compatibility constraints, and even then I don't know if that's legit.
They’ve seen it, and they’re working on it, see (and please don’t spam…) : https://github.com/microsoft/terminal/issues/10461
Don’t really see why MS should care about terminal bandwidth.
The tl;dw of this is that Muratori implements a reference terminal that uses a tile renderer to render glyphs very fast, whereas Windows Terminal is quite slow.
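For anyone who hasn't watched it, the tile-renderer idea can be sketched in a few lines. This is a hypothetical simplification of the approach, not refterm's actual code: the screen is a grid of cells, each cell stores a glyph plus colors, and each glyph is rasterized at most once into an atlas that every later draw reuses.

```python
# Minimal sketch of a tile/atlas renderer for a terminal grid (hypothetical
# simplification, not refterm's actual code). Each cell is (glyph, fg, bg);
# rendering is one atlas lookup and one copy per cell.

def rasterize(glyph):
    # Stand-in for real glyph rasterization (DirectWrite, FreeType, etc.):
    # done once per glyph, then cached in the atlas forever.
    return [[glyph] * 2] * 2  # pretend 2x2 "bitmap"

def render(grid, atlas):
    """grid[row][col] = (glyph, fg, bg). Returns the drawn tiles in order."""
    out = []
    for line in grid:
        for glyph, fg, bg in line:
            if glyph not in atlas:           # rasterize at most once per glyph
                atlas[glyph] = rasterize(glyph)
            out.append((atlas[glyph], fg, bg))
    return out

atlas = {}
grid = [[("h", 7, 0), ("i", 7, 0)],
        [("h", 2, 0), ("!", 7, 0)]]
tiles = render(grid, atlas)
assert len(tiles) == 4   # four cells drawn...
assert len(atlas) == 3   # ...but "h" was rasterized only once
```

Monospaced cells are what make this so cheap: every tile has the same size, so there is no shaping or layout work on the hot path.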
There’s maybe some moving of the goalposts going on here. In the comment Muratori made on Github, he alleged that his design could be done “in a weekend” and his design included correct ClearType handling.
Microsoft folk said that ClearType couldn’t be done with a glyph atlas approach, and, indeed, Muratori’s readme seems to say that ClearType is not handled properly yet (and would require information from DirectWrite that it does not give).
The point of the video is that trivial back-of-the-envelope calculations show that almost everything is 100-1000x slower than optimal (IMO Casey is being generous with the upper bound here), and that straightforward code, which could be easily understood and probably even written by someone with less than a year of programming experience, is 100x faster than code that cost millions of dollars and many dev-years to produce.
But in the specific case of terminal rendering, drawing text is one of the first things computers did. A $30 Android phone is more powerful than all the computers in the world combined when we first did it, and now rendering text at 60fps is almost universally considered to be too hard.
On subpixel rendering: Casey talks about it in the github issue a bit IIRC. You have a fixed number of foreground/background colour combinations and 99% of the time it’s 16 colours on the default background, so you can key the glyph cache on fg/bg and bake cleartyped glyphs into it and it works out for ~free.
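A hedged sketch of that caching idea (names are mine, not Casey's): because subpixel/ClearType output blends glyph edges against the background, a baked bitmap is only valid for one fg/bg pair, so you include both colors in the cache key. With roughly 16 common colors on one default background, the key space stays tiny and misses are rare.

```python
# Sketch of a ClearType-friendly glyph cache keyed on (glyph, fg, bg).
# Hypothetical: the subpixel-rendered bitmap depends on both colors, so
# both go into the key; repeated glyphs with the same colors are free.

rasterize_calls = 0

def rasterize_subpixel(glyph, fg, bg):
    global rasterize_calls
    rasterize_calls += 1
    return f"bitmap({glyph},{fg},{bg})"  # stand-in for the real rasterizer

cache = {}

def get_tile(glyph, fg, bg):
    key = (glyph, fg, bg)
    if key not in cache:                 # only pay rasterization on a miss
        cache[key] = rasterize_subpixel(glyph, fg, bg)
    return cache[key]

# The 99% case: same fg/bg everywhere, so only unique glyphs cost anything.
for ch in "hello hello":
    get_tile(ch, fg=7, bg=0)
assert rasterize_calls == len(set("hello hello"))   # 5, not 11
```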
Text is incredibly hard if you want to do it right. Obviously, the difficulties of vector fonts, complex scripts, RTL, now colour emoji, etc. But even back when we just had Latin characters, we had separate computers that did nothing but render text and handle input - physical terminals. The glass teletypes employed a lot of hacks to be useful; and of course, they were bottlenecked by their serial lines and couldn’t do bitmaps.
Rendering text is hard, always has been, and now there's more to do and fewer shortcuts.
It is hard, especially in the general case, but when you have monospaced text at a single font size and outsource text shaping and glyph rendering to DirectWrite, most of the hard problems go away.
Alternatively, Slug does everything for you and runs at 1000+ fps and is available for an amount of money that rounds to zero for a company like Microsoft.
Slug is very cool, thanks for linking to it!
Greater than 1000 fps might be overstating it unless you have experience with Slug. In the original paper, Section 5, they report 1.1 ms to fill a 2-megapixel area with just 50 lines of Arial, and from the algorithm description, more, smaller text may take longer, but probably still fast enough for hundreds of fps.
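The arithmetic behind that hedge, using the 1.1 ms figure quoted above:

```python
# Back-of-envelope from the paper's Section 5 number: 1.1 ms per frame
# for the 2-megapixel test scene works out to just under 1000 fps.
frame_time_ms = 1.1
fps = 1000.0 / frame_time_ms
assert 900 < fps < 910   # ~909 fps, so "1000+ fps" is slightly optimistic here
```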
The clever things that Slug does don't matter if you're not doing geometric transformations on the glyphs anyway, but it would be nice for VR terminals! (Slug also doesn't do font fallback, and doing that right may or may not be tricky.)
There’s a criticism of some of the tradeoffs, that it uses significant resources when idle, needs 10-15x more RAM, and doesn’t sanitize input properly.
I wish I had more upvotes to give this. Whenever Muratori or especially Blow go off on "you're making bloated software!", their simplicity is achieved by either ignoring domain complexity (usually most of this only holds for pure ASCII, or maybe a subset of Unicode) or punting on other metrics (in this case: security, memory usage, idle behavior, etc.). I don't care that my terminal emulator is 5% faster if my GPU fan kicks in.
It’s basically suckless for Windowsy Direct3D gamedevs.
I wonder why terminals don't entirely turn off rendering past a certain speed limit. There is some value in seeing patterns fly by, but at some speed even patterns become indistinguishable. The whole screen might as well be replaced by a bandwidth indicator until the output stops or slows down.
This is exactly what my own terminal emulator implementation does for that very reason - the point is to read the output, not just to render a blur. Latency matters - lag when you’re typing is really annoying - but throughput doesn’t need to be rendered at all.
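One way to sketch that decoupling (a hypothetical design, not this commenter's actual implementation): each frame, the screen model consumes every byte the child process has written, but the renderer paints only the final state once, so output throughput never forces extra paints.

```python
# Sketch of decoupling throughput from rendering (hypothetical design):
# drain the entire backlog into the screen model, then paint at most once.

def process_and_render(pending_chunks):
    """One frame: consume everything pending, paint the latest state once."""
    screen = None
    consumed = 0
    for chunk in pending_chunks:        # drain ALL pending output
        consumed += len(chunk)
        screen = chunk                  # stand-in for VT parsing into the model
    paint_count = 1 if screen is not None else 0   # at most one paint per frame
    return consumed, paint_count

# 10,000 lines arrive within one frame interval: all consumed, one paint.
chunks = [f"line {i}\n" for i in range(10_000)]
consumed, paints = process_and_render(chunks)
assert paints == 1
assert consumed == sum(len(c) for c in chunks)
```

Keystrokes still get handled every frame, so latency stays low even while the backlog is effectively frame-skipped.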
I’m having a hard time getting my head around why the performance of this particular thing is actually considered a problem by anyone. Even the “slow” demo (that takes 5:30 to dump a 1G file to the terminal) is scrolling way too fast for any human being to even attempt to read it. If you actually want to see that you’ll dump it to a log file and look at it in something you can keep up with. What is the supposed benefit of this being faster?
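For scale, the "slow" demo's own numbers work out to surprisingly little bandwidth:

```python
# Back-of-envelope on the figures above: 1 GB dumped in 5:30 is only about
# 3 MB/s, far below what the hardware can move (treating 1G as 1024 MB).
seconds = 5 * 60 + 30
mb_per_s = 1024 / seconds
assert 3.0 < mb_per_s < 3.2   # ~3.1 MB/s
```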
I mean, if nothing else, terminals should be able to go “Ok, this is too fast to keep up with”, and do some sort of frame skipping. If a terminal insists on displaying every line of output in some fashion, then speed is important, IMO.
I think the 1GB file is just a performance metric.
I tested this on st: just cat the 1 GB file, and it has roughly the same performance. I can't say I've ever run into this issue, but maybe it matters if you do something like Dwarf Fortress with lots of updates? I don't know, and arguably the terminal is kind of a poor rendering stack for that sort of thing in the first place, for many reasons unrelated to performance.
Sometimes, I accidentally log something long to a terminal that I still want to be able to use (perhaps I want to re-enter the command that I entered by pressing ‘up’, then adding --quiet).
I have had slow terminals lock up and be unable to e.g. forward a ctrl-c to the child process (or a child process that ignores signals).
Using a fast terminal means that it only becomes unresponsive momentarily, instead of appearing stuck for an indeterminate time.
This seems like a separate issue from rendering throughput; if the terminal is taking a long time to respond to input then that is indeed a problem, but I don’t think that’s what was at issue here.