I want to write a machine code development tool.
I’ll write it in Common Lisp.
I need to be able to exit the program (nonportable) and send terminal control codes.
I spend a day researching all of the different exiting S-expressions before collecting them into a library.
I don’t like Ncurses or binding with C at all, so I’ll write my own terminal library.
Using this terminal library isn’t all that pleasant, so I’ll write an abstract terminal library on top of that.
That underlying terminal library doesn’t depend on ASCII, but does detect it and optimize for it, so I could turn character-set detection into its own library.
This tool should be customizable, but I don’t care for arbitrary customization languages, so I’ll make it customizable in a machine code.
The target, CHIP-8, is entirely unsuited to this, so I’ll create a Meta-CHIP-8.
My Meta-CHIP-8 virtual machine needs to be able to access the primitive routines of the system in a well-designed fashion, so I must spend time designing each one where all behavior makes enough sense.
I don’t have a Meta-CHIP-8 development environment, so I’ll just use a hex editor.
This Common Lisp program, with its own machine code customization language, really isn’t that pleasant to read through, even though it’s efficient and works, sans a few minor flaws in some more advanced functionality I’ve simply yet to correct. It also consumes megabytes of memory, since it runs in a Common Lisp environment. I could rewrite it in chunks, again, and start removing Meta-CHIP-8, but I’m so exhausted by it at this point.
I’ll write a new version in Ada, which will use far less memory.
I’ll start learning Ada.
Now that I’ve largely learned Ada, I really should start implementing some of my Common Lisp libraries in Ada so I can enjoy some of the same abstractions.
That’s roughly where I’m at, currently. Some of this isn’t strictly hierarchical, but you get the idea. Something pleasant to do is occasionally take a break and work on something else, such as how I’ll be working on some server software in Common Lisp and Ada, to compare the two. However, I need my own networking library in both, since I’m dissatisfied with what’s available, and I dislike the Common Lisp multi-threading options, so I’ll need to write my own there. You understand the general trend.
That’s great. The CL and Ada thing especially. I once suggested on Schneier’s blog that mixing the most powerful and most constrained languages together might make for an interesting experience. Maybe mock up Ada in CL with an extraction-to-Ada tool. Then, you get CL’s rapid prototyping and metaprogramming with Ada’s safety checks and compiler performance. I never tried it since they were both pretty complicated. The ZL guy is the closest, doing C/C++ in Scheme.
This multiple levels of escaping is nonsense, in my opinion, although this is nothing against the author.
I’m disappointed that a better option, my preferred option, wasn’t mentioned: the beginning and ending character can be doubled to provide for a single instance. Follows is an example:

''''

In APL, this evaluates to a string with a single quote. Follows is another example:

""""

In Ada, this is a string containing a single double quote.
Phrased differently, in my preferred way of considering it, this is the simple idea that a string containing the string character itself can be represented by two strings juxtaposed, with the joining point becoming the string character in a new string that is the combination.
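A tiny decoder makes the juxtaposition view concrete (parse_doubled is a hypothetical helper illustrating the convention, not any language’s actual reader):

```python
def parse_doubled(literal, quote='"'):
    """Decode a string literal that doubles its quote character,
    as in Ada or APL.  Hypothetical helper, not a real parser."""
    assert literal[0] == quote and literal[-1] == quote
    # Strip the outer quotes, then collapse each doubled quote into one.
    return literal[1:-1].replace(quote * 2, quote)

parse_doubled('"He said ""hi"""')    # -> 'He said "hi"'
parse_doubled("''''", quote="'")     # -> "'"  (the APL case above)
```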
It’s disappointing Common Lisp uses escaping instead of this much nicer way.
I too prefer doubling the quote character over backslash + quote, since I personally find it more intuitive and aesthetic.
However, this is still escaping, at least in the way I used the term in the article—the quote character does not represent itself but instead is a dispatch character. It will still suffer some downsides of escaping, e.g. copying the text between the start and end quote characters will not put the actual string in your clipboard.
The author appears to be aware of this method of escaping quotes:
In PostgreSQL, apostrophe (') is escaped as two apostrophes ('').
I first saw this kind of escaping in Plan 9’s rc(1):
A quoted word is a sequence of characters surrounded by single quotes ('). A single quote is represented in a quoted word by a pair of quotes ('').
I dislike how GitHub is being used as a blog by so many; I dislike GitHub entirely, however, so that’s just one of many points I could make.
Use Tor. It’s telling the author never mentions Tor; I’m inclined to believe the author isn’t qualified to write about this topic.
Use Tor. It’s telling the author never mentions Tor; I’m inclined to believe the author isn’t qualified to write about this topic.
Tor requires more discipline to get what you want out of it, and many exit nodes are on block lists, making day to day browsing kind of troublesome.
VPN companies, on the other hand, often sponsor popular YouTube channels, and make other large ad buys, giving them reach far beyond tech audiences who might understand better how to work around some of the Tor problems. In other words, targeting VPN providers provides far greater bang for your proverbial (and literal) buck.
The amount of discipline required to “get what you want” out of Tor is the same as the discipline required to get the same thing from any other VPN provider: anonymity in the face of a determined adversary. It’s just that Tor’s documentation and default configuration are designed for such an adversary, while most VPNs are designed for an apathetic adversary that’s willing to store your IP, but not willing to deal with the hassle and potential false positives of fingerprinting.
Tor requires more discipline to get what you want out of it
If “what you want” == “just hide the dang home IP address”, it doesn’t require much. Just start it and set it as the proxy in Firefox. And, yes, it requires a bit more patience for the captchas and crap, but it’s really not that bad.
Well, people are told very vague “privacy” and “encrypt your internet” stuff but realistically, as a normal person (i.e. “Not Snowden”) you want two things
As some advice to the author, I originally mistook this for a submission of a much older article, since I didn’t look at the submission information; my point is that perhaps drastically different imagery would’ve prevented this, as I believe the article I mention used at least one of the same images. This is tangential and an arguable point, however.
Anyway, the article mentions UNIX and is tagged with that, but that doesn’t mean ASCII is actually related to UNIX.
The exact meaning of control characters has varied greatly over the years (which is why extensive termcap databases are required).
This article would be improved by mentioning ECMA-48 by name. Anyway, I was disappointed, but not surprised, to see the UNIX method portrayed as the only solution. Follows is an excerpt from “The UNIX-HATERS Handbook”:
As soon as more than one company started selling VDTs, software engineers faced an immediate problem: different manufacturers used different control sequences to accomplish similar functions. Programmers had to find a way to deal with the differences.
Programmers at the revered Digital Equipment Corporation took a very simple-minded approach to solving the heterogeneous terminal problem. Since their company manufactured both hardware and software, they simply didn’t support terminals made by any other manufacturer. They then hard-coded algorithms for displaying information on the standard DEC VT52 (then the VT100, VT102, and so on) into their VMS operating system, application programs, scripts, mail messages, and any other system string that they could get their hands on.
At the MIT AI Laboratory, a different solution was developed. Instead of teaching each application program how to display information on the user’s screen, these algorithms were built into the ITS operating system itself. A special input/output subsystem within the Lab’s ITS kernel kept track of every character displayed on the user’s screen and automatically handled the differences between different terminals. Adding a new kind of terminal only required teaching ITS the terminal’s screen size, control characters, and operating characteristics, and suddenly every existing application would work on the new terminal without modification.
Unix (through the hand of Bill Joy) took a third approach. The techniques for manipulating a video display terminal were written and bundled together into a library, but then this library, instead of being linked into the kernel where it belonged (or put in a shared library), was linked with every single application program. When bugs were discovered in the so-called termcap library, the programs that were built from termcap had to be relinked (and occasionally recompiled). Because the screen was managed on a per-application basis, different applications couldn’t interoperate on the same screen. Instead, each one assumed that it had complete control (not a bad assumption, given the state of Unix at that time.) And, perhaps most importantly, the Unix kernel still thought that it was displaying information on a conventional teletype.
Now, returning back to your article:
This is kind of neat and well designed, but for us it means:
There is no way to see if the user pressed only Control or Shift, because from a terminal’s perspective all they do is modify a bit for the typed character.
Yes, this complicates advanced key-chords; a good test is seeing if Emacs differentiates; if not, then it’s unlikely it can reasonably be done.
There is no way to distinguish between the Tab key and Control+i. It’s not just ‘the same’ as Tab, Control+i is Tab.
Yes, I’ve had to explain this to one fellow while teaching him the design of a terminal control library I’ve written. This is another disadvantage of using Control.
Sending Control with a character from the 2nd column is useless. Control clears the 7th bit, but this is already 0, so Control+# will just send “#”.
Yes, that’s the last main issue with it. It’s noteworthy that the Meta or Alt key avoids this issue; you can configure some terminals to set the eighth bit or prefix with the Escape character, but those that can’t be configured use the latter convention; this is one of many issues with comprehensively parsing more advanced terminal input, but it does have the nice property of lacking the special cases the Control key has. Mentioning Meta or Alt would’ve perhaps been a good idea.
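The bit-clearing convention discussed above can be sketched in a couple of lines (a sketch of the convention, not any particular terminal’s code):

```python
def ctrl(key):
    """Return the character a terminal sends for Ctrl+<key>: the
    (upper-cased) character code with the 0x40 bit cleared, which in
    practice is the same as masking down to the 0x00-0x1F range."""
    return chr(ord(key.upper()) & ~0x40 & 0x7F)

ctrl('i') == '\t'   # True: Ctrl+I *is* Tab
ctrl('m') == '\r'   # True: Ctrl+M is carriage return
ctrl('#') == '#'    # True: the 0x40 bit is already clear, so nothing changes
```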
As some advice to the author, I originally mistook this for a submission of a much older article, since I didn’t look at the submission information; my point is that perhaps drastically different imagery would’ve prevented this, as I believe the article I mention used at least one of the same images.
I’m not sure which older article you mean? I just wrote this because it’s a common question/source of confusion, and frustration with how shit asciitable.com is (still first hit on Google :-/).
This article would be improved by mentioning ECMA-48 by name. Anyway, I was disappointed, but not surprised, to see the UNIX method portrayed as the only solution.
I linked to https://en.wikipedia.org/wiki/ANSI_escape_code for now.
The page isn’t intended to give a full and comprehensive overview of all the history; it explicitly mentions that “many aspects have been omitted”. My only goal was to provide exactly enough information for people to understand why CTRL+I sends a tab character, and why they can’t remap it in Vim etc. Nothing more.
This is also why it just talks about Unix. Unix still exists and is what many people use. VMS and ITS? Not so much…
Mentioning Meta or Alt would’ve perhaps been a good idea.
It’s mentioned briefly (“This is also how the Alt key works: Alt+a is <Esc>a.”). To be honest, I was tempted to not even include the entire section about escape sequences at all, since it doesn’t directly pertain to the question I wanted to answer. I mainly included it to make sure people wouldn’t be confused and think F1 or the arrow keys are control codes.
This is a mirror of a post from John Carmack. Recently I learned that his articles on #AltDevBlog are no longer accessible. So, in order to archive them, I am re-posting them here. These articles are definitely good reads and worth preserving.
I wonder what the legality of that is.
Anyway, this makes me think about a practice I’m soon to begin, which is similar to, but not quite, these parallel implementations. That is, having multiple implementations with different strengths, written in different languages, so that one is not intended to supplant the other, but more or less to compare the two languages on the basic problem, likely with certain additions attuned to the strengths of each.
I wonder what the legality of that is.
I don’t know about the legality but Carmack himself was pretty happy the article was saved: https://twitter.com/ID_AA_Carmack/status/1156639168002428931
What about an extra slash after the item type?
Gopher URLs have a specific format, where the first character is the item type and isn’t transmitted. The part after that is the selector.
My Gopher URL encodes the following selector, which is the correct one:
Your Gopher URL encodes the following selector, which is incorrect; some servers do have selectors beginning this way, but mine doesn’t:
My Gopher server is currently too lenient on what it accepts, but I expect to change this, so selectors that begin with “/” and requests that, say, use Line Feed instead of Carriage Return and Line Feed will stop working at that point.
We can see it here (at 2.1. Gopher URL Syntax):
A Gopher URL takes the form:

    gopher://<host>:<port>/<gopher-path>

where <gopher-path> is one of:

    <gophertype><selector>
    <gophertype><selector>%09<search>
    <gophertype><selector>%09<search>%09<gopher+_string>
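In other words, the first character of the URL path is the item type and only the remainder is transmitted as the selector, which is exactly the distinction at issue above. A rough sketch (parse_gopher_url is a hypothetical helper; it ignores the %09 search forms for brevity):

```python
def parse_gopher_url(url):
    """Split a gopher URL into (host, port, itemtype, selector) per the
    RFC 4266 syntax.  Hypothetical helper for illustration only."""
    rest = url[len("gopher://"):]
    hostport, _, path = rest.partition("/")
    host, _, port = hostport.partition(":")
    # The first path character is the item type and is NOT transmitted.
    itemtype, selector = (path[0], path[1:]) if path else ("1", "")
    return host, int(port or 70), itemtype, selector

parse_gopher_url("gopher://example.com/0intro")      # selector is "intro"
parse_gopher_url("gopher://example.com:71/0/intro")  # selector is "/intro"
```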
I suppose this is the gopher server? http://verisimilitudes.net/gophershell
Simple and efficient!
I would love to read the implementation but I think there’s some encoding problem. None of the APL symbols are rendered properly for me. For example I see â where I would expect ⍝.
The link is written in this way:
<a charset="UTF-8" href="masturbation.apl">implementation</a>
This didn’t correct it, however. Your browser probably gives you the option to change the character encoding of a document manually, but I’ll change the link to behave properly if it’s a matter of changing this tag.
Your server isn’t sending a content type or encoding header with the page itself. The charset attribute on the anchor isn’t supported by any browser. I don’t know of a way to change the encoding client-side in mobile Safari, but you are right it can be changed in most desktop browsers.
As @spc476 said, the best way to correct it is to configure Apache to deliver files with the .apl extension with a Content-Type: text/plain; charset=utf-8 header.
Another way to fix it with Apache is to add an AddDefaultCharset directive to the configuration.
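For instance, either of these would do it (standard mod_mime / core directives; the .apl mapping is just the case discussed above):

```apache
# Map the .apl extension to plain text and declare its encoding.
AddType text/plain .apl
AddCharset utf-8 .apl

# Or, more bluntly: declare a default charset for every text response.
AddDefaultCharset utf-8
```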
I find RISC misguided. The RISC design was created because C compilers were stupid and couldn’t take advantage of complex instructions, and so a stupid machine was created. The canonical RISC, MIPS, is ugly and wasteful, with its branch-delay slots and large instructions that do little.
RISC, no different than UNIX, claims simplicity and small size, but accomplishes neither and is worse than some ostensibly more complex designs.
This isn’t to write all RISC designs are bad; SuperH is nice from what I’ve seen, having many neat addressing modes; the NEC V850 is also interesting with its wider variety of instruction types and bitstring instructions; RISC-V is certainly a better RISC than many, but that still doesn’t save it from the rather fundamental failings of RISC, such as its arithmetic instructions designed for C.
I think, rather than make machines designed for executing C, the “future of computing” is going to be in specialized machines that have more complex instructions, so CISC. I’ve read of IBM mainframes that have components dedicated to accepting and parsing XML to pass the machine representation to a more general component; if you had garbage collection, bounds checking, and type checking in hardware, you’d have fewer and smaller instructions that achieved just as much.
The Mill architecture is neat, from what I’ve seen, but I’ve not seen much and don’t want to watch a YouTube video just to find out. So, I’m rather staunchly committed to CISC, and extremely focused CISC, as the future of things, but we’ll see. Maybe the future really is making faster PDP-11s forever, as awful as that seems.
“The RISC design was created because C compilers were stupid and couldn’t take advantage of complex instructions”
No. Try Hennessy and Patterson’s Computer Architecture for a detailed explanation of the design approach.
This isn’t a bad article, as I like how it suggests restructuring to avoid the issue, but I feel this is still a much better analysis of the particular issue.
I really didn’t like the article you linked when I read it some time ago. I don’t see where it analyzes the issue like you say, and the advantage it advertises can be gained just by linting the C code and forcing braces. Extra braces are a compile error in C just like an extra end if is in Ada. I really like Ada, but this has to be the worst reason I’ve ever read to use it.
Firstly, I try to avoid having accounts at all. This Lobsters account had been the first one I’d made in several years, I believe. Accounts that I’d made in the past and regard as useless I’ve deleted to the best of my ability, not that there were many to start with. You probably don’t have a list of accounts you could go through and manage, do you?
Anyway, I host my own email on my own domain, but this isn’t what I use AFK. I’m a Free Software Foundation member and so I give others, including businesses, my email@example.com email address and this has served me well, as I can very easily have it point to any address, which allowed me to seamlessly transfer it from one email provider to my self-hosting without any issue.
So, my advice boils down to have an email address you can point to any other email address and start using that one, in brief.
Genuinely curious: You claim to have very few accounts. Do you just not use online services/sites?
Do you just not use online services/sites?
That’s exactly what I do. In general, I won’t use something if it requires me to make an account. I have a list of online accounts I have that are still active, including government accounts and whatnot, and it’s roughly ten or so, most of which simply haven’t been killed yet.
So, using domain names as if they’re Twitter hashtags is continuing, I see.
Again, there’s not a mention of Gmail and its malicious ways here.
After this individual stepped in to provide this “service”, the ability to contribute to the mailing lists using e-mail with HTML ceased.
Anyway, I find it amusing this article is written without HTML, to counter that other one. Say, perhaps I should register a domain and participate in this dumbassery. It looks like an easy way to get attention, since everyone loves to share their opinions and personal preferences and it’s just technical enough for people to feel smart while discussing it. No, I won’t.
So, using domain names as if they’re Twitter hashtags is continuing, I see.
It’s way easier to remember “stop-gatekeeping.email” than “www.example.com/doc/wp/2019-07-24/stop-gatekeeping-email-6655321.html”.
Say, perhaps I should register a domain and participate in this dumbassery. It looks like an easy way to get attention, since everyone loves to share their opinions and personal preferences and it’s just technical enough for people to feel smart while discussing it.
Perhaps I should write an article distilling my ideas concerning this.
A good programmer is likely going to be a hacker (not a cracker). A hacker is someone with a nice sense of aesthetics and creativity, as two qualities. A hacker is going to implement software he designed himself, likely for his own use. You can be a good programmer, just writing the same things others do, with libraries others wrote, and other such things, but I’d be inclined to classify that as average, rather than good, which I’m considering above average.
A hacker is probably going to design and implement libraries for his own purposes, rather than reuse something someone else wrote, but this is debatable. A hacker should have a genuine interest in the topic, so a hacker is likely going to know a wide variety of languages, learning more as mastery of one is achieved. I’d be inclined to argue a hacker will work with languages such as Lisp, APL, and machine codes more than Go, C++, and Java; note that the former group is filled with variety, whereas the latter group is roughly the same language.
You can be a hacker in isolation, but a hacker is likely going to have some manner of home group of sorts. A hacker probably spends all or most errant thought mulling over the topics of interest. If you’re a programmer just for your job and you don’t think of it much or at all outside, then you can be an average programmer, but not a hacker.
Tying in with the creativity and whatnot mentioned earlier, a hacker is going to create novel things and be interested with potentially obscure things. I’m a hacker and I have a tiny little esoteric language and work with CHIP-8 a good amount; I work on a novel machine code development tool I’ve designed. This isn’t boasting, but merely examples.
As you can guess, I’ve described a good bit of myself in this message. I won’t claim to have distilled the essence of being a hacker in this message and probably not in any articles I write about it, either, but I do believe this is at least a good general idea. If you’ve not done so, you could bother RMS with this question. That’s all.
This is neat. Also, I found it amusing that this Python program is mostly Fortran, if I understand what Numpy is.
This was a good intro into NumPy,
It’s neat to see more small software under the GPLv3, a good license. It’s easily audited and not liable to be locked away. That it’s rather finished is also good; more software should strive to be finished.
I suppose it was silly to expect interactive debugging such as Common Lisp provides.
But intercepting a live program in production for debugging purposes is not feasible because it would impact the production system.
This is false, as my expectation implies.
To understand a program at any exact moment, we use core dumps.
Core dumps don’t contain the entire state of a program.
If a process is crashing consistently without enough diagnostics, it might be hard to pinpoint to the problem without conventionally debugging the program.
Programs shouldn’t really “crash” anyway.
The difficulty of reproducing production environments leads to the fact that it is hard to reproduce the execution flows that are taking place in production.
This is all the more reason to support interactive debugging, which is low-cost.
It is primarily used for post-mortem debugging of a program
To paraphrase “The UNIX-HATERS Handbook”, this is like medicine as autopsy.
Core dumps don’t contain the entire state of a program.
What state that you need for debugging is missing from a core file?
Current offset for open files possibly. Actually lots of file info that’s typically stored kernel side. Remote address of sockets.
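Those examples are easy to see on Linux: per-fd state like the current offset lives kernel-side in /proc, not in the process image a core dump captures. A small sketch (Linux-specific):

```python
import os
import tempfile

# The file offset is kernel-side state: visible in /proc/<pid>/fdinfo,
# but absent from a core dump of the process itself.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello world")
os.lseek(fd, 3, os.SEEK_SET)

with open(f"/proc/self/fdinfo/{fd}") as info:
    fields = dict(line.split(":\t", 1) for line in info if ":\t" in line)

print(fields["pos"].strip())   # the current offset: 3
os.close(fd)
os.unlink(path)
```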
I have a file of jokes of mine and this one is appropriate.
With regards to Linux kernel development practices:
The Linux kernel, due to C, is a monument to convention.
Absolutely none of this nonsense is necessary; it’s anti-knowledge. Imagine what operating systems would be like if people weren’t determined to use garbage from the 1970s to reimplement garbage from the 1970s.
Windows NT and commodity clouds weren’t vaporware. Even Midori got built and deployed in the field. It turned into vaporware in the general market because Microsoft wanted nothing threatening their cash cow.
True enough, nobody ever said Windows NT was better, faster, and more provably correct. But it was written in C/C++, so that probably explains both that it works and that it’s not so good?
People definitely said the user experience was better than DOS/UNIX, I don’t know if it was faster (or resource efficient) unless you’re comparing old Windows to modern Linux, and Shapiro wrote the definitive piece on its [in-]security certification. He had some vaporware himself in that which happened due to a mix of market reality for high-security OS’s and… Microsoft hiring him. Oh the irony.
Then again, I usually think of MS Research and MS Operations (Windows etc) like different groups. MSR hired him. They do great work. MSO fucks up theirs to maximize numbers in annual reports. His essay called out MSO before being hired by MSR. So, maybe no irony even though “Microsoft” is involved in both situations.
It turned vaporware in general market cuz Microsoft wanted nothing threatening their cash cow.
Is there a reference for this being the reason? Midori was super interesting and I find it hard to find info on it outside the blog post series.
I can’t remember if I have a source. This might be an educated guess. Microsoft has done everything they can with marketing, their legal team, and illegal deals to make Windows desktop and Windows Mobile (at one point) succeed against everything else. They tried lawsuits against Linux users. They pulled billions in patent royalties from Android suppliers. They’ll do anything to protect their revenues or increase them.
Most of their profits in Windows ecosystem come from businesses that are locked in to legacy code that runs on Windows and/or consumers that want to run Windows apps. Their other software, which they cross-sell, is built around Windows. Any success of Midori threatens that with unknown positives. A turf war between the group wanting to maintain Windows and a group pushing Midori is almost certainly resulting in Midori losing.
Further, they’d have to port their existing offerings to Midori to keep the cross-sells. Vista already showed how much they sucked at pulling off such a transition. That adds major risk to a Midori port. So, final decision by business people will be Windows is huge asset, Midori is minor asset with high liability, and they should just back Windows/.NET porting advances of Midori into it.
That’s my educated guess based on both their long-term behavior and the fact that I can’t buy Midori.
Thanks for the response! I’m aware of the company’s history and can of course see how one can project forward to that conclusion, I just wanted to know if there was anything solid written about why the project came apart.
I know the developers kept leaving. That’s usually a bad sign. Then, Duffy wrote this cryptic message on his blog:
“As with all big corporations, decisions around the destiny of Midori’s core technology weren’t entirely technology-driven, and sadly, not even entirely business-driven.”
Usually a sign management is being incompetent, scheming, or both.
Metadata leaking such as this is one reason I tend to avoid systems I don’t fully understand. I avoid using these common version control systems, because I don’t know precisely what manner of metadata each one leaks. I don’t use one at all, although if I did I’d probably scrub this metadata out first.
Of course, I doubt anyone would be interested in such metadata of mine, anyway.
The time is part of a git commit, so if you publish a git repo, the time will be in there. That being said, you can easily randomize it, if you want. People do that to get a funny history on their GitHub page.
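Concretely, Git reads both timestamps from environment variables, so scrubbing or randomizing them is a one-liner at commit time (the throwaway repo below is just for illustration):

```shell
# Git takes both commit timestamps from the environment when they are set.
dir=$(mktemp -d)
git init -q "$dir"
cd "$dir"
git config user.email "you@example.com"
git config user.name "Example"
echo hello > file.txt
git add file.txt
GIT_AUTHOR_DATE="2005-04-07T22:13:13+00:00" \
GIT_COMMITTER_DATE="2005-04-07T22:13:13+00:00" \
git commit -q -m "commit with a fabricated date"
git log -1 --format=%as   # prints 2005-04-07, not today
```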
Firstly, I’m not supporting the idea that tests such as this should be used, as that’s clearly asinine.
However, I want to address the idea that every time these programs derive a result that goes against the narrative, they’re “biased”. This is mostly with regards to @icefox and others.
People have tried to do similar things in the US, and all they ended up doing is training their NN’s to detect black people
Why is that inherently a wrong result? Explain to me why you believe that, if a criminality inference system disproportionately detects criminals of a certain race, that it’s inherently wrong.
Amazon’s(?) attempt at training a NN to select good potential hires which always advised that they choose white males to be their engineers.
Again, why is this an inherently wrong result? Would you feel the machine was wrong if it suggested a diverse cast of people, instead, or would you use that as evidence that diversity is good? What makes this system wrong?
Could it be that there are physical characteristics that detect criminals, good employees, etc. and people merely want to believe this isn’t true?
“Criminal activity” is a social construct, not a biological one.
Homosexuality used to be criminal in many jurisdictions. It still is in some. So a system that can implement a “gaydar” by scanning the faces of people and determining their sexual orientation would classify some gay people as criminals in one jurisdiction, and not criminals in others.
The legalization of marijuana has gained popularity in many states and countries lately. A hypothetical program that could pick out potheads from their physical characteristics would classify them as criminals then, and presumably as criminals now - even though they are no longer criminals in the eyes of law.
Criminality is not a measure of good behavior. Frequently innocent people become “criminals” through a biased system. Any attempts to train a neural net to find criminals based on existing methods will only reflect the biases in our existing methods.
tl;dr I don’t automatically reject such outcomes but I scrutinize them more because they’re typically weak results riding a hype wave in my experience.
My experience and study have led me to believe this is very unlikely. I therefore expect very strong evidence that there’s a predictive relationship, because I have a long history of observing evidence to the contrary.
Often the claim being made isn’t rooted in objective fact but in social constructs. gerikson has a great example - criminality is a social construct so it’s rooted just as strongly in social definitions of crime as it is in the criminal’s qualities.
Additionally, when I see these types of claims being made, they’re typically made in an intellectually dishonest way, even if that’s not the original author’s intent. For instance, poor control for confounding variables. It’s true that plenty of true ideas are argued for badly, but when I see results that are consistently arrived at through motivated reasoning, then all else being equal I tend to be more skeptical of them than of ideas I don’t have a prior opinion of. This might not be logically sound but empirically it has served me well.
I’m squarely in that group that simply avoids using Ncurses. I find Ncurses to be a baroque API, from what I know, and as with many UNIX APIs it seems so often I learn of some new quality and can’t tell if it’s a joke or not, at first.
My advice is to simply use ECMA-48; every terminal emulator you’re likely to come across supports a subset, and it’s easy to use, being free from “color pairs” and other nonsense. The only issue is sending different sequences to terminals that don’t support the subset being used, or finding a common subset, but there is a point where this ceases to be reasonable and one adopts a “If your terminal doesn’t even support this, I don’t care if it works.” attitude.
Writing a library in most any language that sends ECMA-48 codes is simple, so you can work in whichever language you’re actually using instead of binding with a decrepit C library. It’s also important to stress that people only really use Ncurses and whatnot still because the graphical APIs of UNIX are far worse to cope with.
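As a sketch of how little such a library needs, here are a couple of helpers emitting ECMA-48 sequences directly (the names are mine, not from any existing library):

```python
# ECMA-48 control sequences start with CSI (ESC '[').  SGR (final byte "m")
# sets colors and attributes; CUP (final byte "H") moves the cursor.
# That covers most of what a typical TUI needs.
CSI = "\x1b["

def sgr(*params):
    """Select Graphic Rendition: sgr(31) is red foreground, sgr(0) resets."""
    return CSI + ";".join(map(str, params)) + "m"

def cup(row, col):
    """Cursor Position (1-based row and column)."""
    return f"{CSI}{row};{col}H"

print(sgr(1, 31) + "bold red" + sgr(0))
```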
Now, to @Hales :
If one digs deep enough into the history of computing, one learns that what’s modern is distinctly worse than prior systems. There were easy graphical systems on computers decades ago, but UNIX is still stuck sending what it thinks are characters to what it thinks is a typewriter, and using X11 lets you pick your poison between working with it directly (Wayland is coming any year now, right?) or using a gargantuan library that abstracts over everything. It’s a mess, I think, don’t you agree?
Also, @Hales , your note on UTF-8 reminded me of something I found amusing from the Suckless mailing list, when I subscribed to it trying to get proper parsing of ISO 8613-6 extended color codes into st. Simply put, UTF-8 is trivially transparent in the same way color codes are transparent: there are many cases where invariants are violated and programs misbehave, commonly expressed as misrendering. There was an issue with the cols program, that shining example of the UNIX philosophy: it didn’t properly handle colors when arranging its output into columns. To handle this properly, the tool would need to be made aware of color codes and how to understand them, as it would otherwise assume each would render as a single character, ruining the output. The solution, according to one poster on the mailing list, was that colors are bloat and you don’t need them.
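The fix cols would have needed isn’t even large; a sketch in Python, assuming only 7-bit CSI sequences (real ECMA-48 parsing has more cases, such as 8-bit C1 and intermediate bytes):

```python
import re

# Sketch: compute display width by discarding CSI sequences
# (ESC [ <parameter bytes> <final byte>) before counting characters.
# Assumes 7-bit sequences only; a full ECMA-48 parser has more cases.
CSI_RE = re.compile(r"\x1b\[[0-9;:?]*[@-~]")

def display_width(s):
    return len(CSI_RE.sub("", s))

colored = "\x1b[31mred\x1b[0m"
len(colored)            # 12 characters in the raw string
display_width(colored)  # 3 cells actually rendered
```

Columnating on `display_width` instead of `len` is all it would have taken.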
Don’t you agree that’s amusing?
I agree with you. 100%. HOWEVER. Ncurses comes with pretty much every mainstream unix, making it a very attractive target for applications. I think that ncurses’ limitations are holding back innovation of TUI interfaces. If every unix comes with an adequate TUI widgeting library, this encourages quality TUI widget-based systems.
Yes. The entire concept of a TTY is a mess. I am going to work on replacing that. Other people are working on replacing that. (The solution, obviously, is to make the code that generates the text and the code that displays the text work on the same level of abstraction.) That doesn’t change the fact that we’re stuck with ttys for the time being; is it wrong to try to improve our experience with them until we can really leave them?
The stuff we’ve kept from the dawn of computing is basically all either ‘simple, sensible interfaces’, or ‘backwards compatible monstrosity’. Those are, after all, the two reasons to keep something - either because it’s useful as a building block, or because it’s part of something useful.
Have you looked at the TUI client API for Arcan, and the way we deal with colour specifically? If not, in short:
Do you allow the client to set arbitrary 24bit colours to the grid?
Does the TUI API work with text-based but not-fixed-width interfaces (e.g. emacs, mlterm)?
Thank you for posting, I hadn’t heard about arcan until today but have just read a chunk of your blog with interest :)
colors: both arbitrary fg/bg (r8g8b8) and semantic labels to resolve (such as “give me fg/bg pair for alert text”).
shaped text: yes (being reworked at the moment to account for server-side rendering); ligature shaping is there as a ‘per-window’ attribute for now, testing out per-line.
Thanks for the reply.
I think I would want to be able to use different fonts (or variations on a font) for different syntax highlighting groups in my editor. This looks quite nice in emacs and in code listings in latex. Perhaps you consider this to be an advanced use where the application should just handle their own frame buffer, though.
While I have your ear, what’s the latency like in the durden terminal and is minimising latency through the arcan system a priority?
In principle, multiple fonts (even per line) are possible; that’s how emoji works now: one primary font for the normal glyphs, and when there is a miss on lookup, a secondary is used. There is an artificial limit that’ll loosen over time. Right now, the TUI client is actually handling its own framebuffer, and we are trying to move away from that, which can be seen in the last round of commits. The difficulty comes from shaped rendering of ligatures, where both sides need to agree on the fonts and transformations used; doing it wrong creates round-trips (a no-no over the network), as the mouse-selection coordinate translation needs to know where the cursor position actually ended up.
Most of this work is towards latency reduction: removal of terminal protocols fixes synchronization, and moving rendering server-side allows direct-to-scanout, racing-the-beam rasterization — or at least keeps it entirely on-GPU for non-fullscreen cases.
When you say “both sides” do you mean the client on e.g. a remote server and a “TUI text packing buffer” renderer on e.g. a laptop?
Sounds like you could just send the fonts (or just the sizes of glyphs and ligature rules) to the server for each font you intend to use and be done with no round trips. Then you just run the same version of harfbuzz or whatever on each side and you should get the same result. And obviously the server can cache the font information so you’re not sending it often (though I imagine just the sizes and ligatures could be encoded real small for most fonts).
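A hypothetical sketch of that caching idea — everything here (the serialization format, the names, the handshake) is my own invention, not Arcan’s actual protocol — would be to content-address just the metrics the shaping agreement needs:

```python
import hashlib
import json

# Hypothetical sketch: ship only glyph advance widths and ligature
# rules, keyed by a content hash, so the server can cache them and
# skip the transfer on later connections. Not Arcan's real protocol.
def metrics_blob(advances, ligatures):
    """Serialize just what shaping agreement needs, not the font itself."""
    return json.dumps({"advances": advances, "ligatures": ligatures},
                      sort_keys=True).encode()

def metrics_key(blob):
    """Stable content hash the client sends first."""
    return hashlib.sha256(blob).hexdigest()

blob = metrics_blob({"f": 5, "i": 3, "fi": 7}, [["f", "i", "fi"]])
key = metrics_key(blob)
# Client sends `key` first; server replies "have it" or "send blob".
```

With both sides running the same shaper version against cached metrics, the agreement holds without round-trips.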
Do you have any plan about what to do about the RTT between me pressing a key on my laptop, that key going through arcan, the network, remote arcan and eventually into e.g. terminal vim and then me getting the response back? I feel like mosh’s algorithm where they make local predictions about what keypresses will be shown is a good idea.
Sounds exciting! I don’t know what you mean by “moving rendering server-side”, though. Is the server here the arcan server on my laptop? And moving the rendering into arcan means you can do the rendering more efficiently?
Is arcan expected to offer an advantage in performance in the windowed case compared to e.g. alacritty on X11? Or is the benefit more just that anything that uses TUI will be GPU accelerated transparently whereas that’s more of a pain in X11?
Right now (=master) I am tracking the fonts being sent to the client on the server side, so both sides can calculate kerning options and width, figure out sequence to font-glyph id etc. The downsides are two: 1. the increased wind-up bandwidth requirement when you use the arcan-net proxy for network transparency, 2. the client dependency on freetype/harfbuzz.
My first plan for the RTT is type-ahead (local echo in ye olde terminal speak), implemented on the WM level (anchored to the last known cursor position, etc.) so that it can be enabled for other uses as well, such as input-binning/padding for known-networked windows where side channel analysis (1 key, 1 packet kind of a deal) is a risk.
Both performance and memory gains. Since the actual drawing is being deferred to the composition stage, windows that are partially occluded or clipped against the screen only have their visible area processed — while alacritty has to render into an offscreen buffer (which is double-buffered) that then may get composed. So whereas alacritty has to pay for (glyph atlas texture, vertex buffer, front buffer, back buffer) on a per-pixel basis in every stage, the cost here will only be the shared atlas for all clients (GPU memory + cache benefits); the rest would be ~12 bytes per cell plus a vertex buffer.
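A back-of-envelope check of that cost claim, under assumed numbers (1920x1080 fullscreen, 4 bytes per pixel, a 10x20-pixel cell grid — the specific figures are mine, only the ~12 bytes/cell comes from the comment above):

```python
# Back-of-envelope comparison: double-buffered per-pixel storage
# versus ~12 bytes per character cell, under assumed dimensions.
w, h = 1920, 1080
bpp = 4                          # assumed bytes per pixel
cell_w, cell_h = 10, 20          # assumed glyph cell size in pixels

pixel_buffers = w * h * bpp * 2  # front + back buffer
cells = (w // cell_w) * (h // cell_h)
cell_storage = cells * 12        # ~12 bytes per cell, per the comment

print(pixel_buffers // 1024, "KiB for double-buffered pixels")
print(cell_storage // 1024, "KiB for per-cell storage")
```

Roughly 16 MiB versus ~120 KiB per client, so the two-orders-of-magnitude difference is plausible even before counting the shared atlas.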
It has been here for a while. GNOME and KDE support it natively, a few popular distros ship with Wayland enabled by default. Firefox is a native wayland app. What makes you think it’s “any year now” ?
Colours are not the only invisible control codes that I’d expect cols to have to handle. Alas I can’t see a “simple” solution to this. You pretty much have three options:
Of the crappy options I can see: #1 does seem the most like something suckless devs would like.
Suckless makes some great stuff, but some of their projects are a bit too minimal for me. Take for example their terminal emulator st:
I love my scrollwheel, whether it’s a real one or two-fingers on my cheap little laptop’s touchpad. That and being able to use Shift+PageUp/Down. I guess everyone draws the line somewhere differently, and I have more things about my current term (urxvt) that I could moan about.
I don’t have any raw X11 experience. I’ve primarily used SDL, which yes indeed abstracts that away for me.
I’m not completely convinced that wayland is going to be the answer: from everything I read it seems to be solving some problems but creating entirely new ones.
From a user perspective however: Xorg is wonderful. It just works. You have lots of choice for DE, WM, compositor (with and without effects), etc. The open source graphics drivers all seem to have a TearFree option, which seems to work really well. I’d love to see latency tested & trimmed, but apart from that the idea of having to change a major piece of software I use scares me. I don’t want to give up my nice stack for something that people tell me is more better (or more “modern”).
You forgot one:
Interpreting ECMA-48 for this isn’t that bad; I have Lua code that does just that. A half-assed approach would be to assume all of C0 (0-31, 127) as 0-width and all of C1 (128-159) as 0-width, with special handling for CSI (155) and ESC (27). For ESC, just suck up the next character (except for ‘[’), and for CSI (155, or ESC followed by a ‘[’) just keep sucking up characters until a character from 64 (@) to 126 (~) is read; that will catch most of ECMA-48 (at least, the parts that are used most often).
It uses LPEG. I also have a version that’s specific to UTF-8, but this one is a bit easier to understand, in my opinion.
127 isn’t technically in C0 or C1, but I just lump it in with C0.
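For those who don’t read LPEG, the half-assed approach described above translates almost mechanically into Python (a sketch of the stated rule, not a full ECMA-48 parser):

```python
# Sketch of the approximation described above: C0 and C1 count as
# zero-width, ESC eats the next character, and CSI (155, or ESC '[')
# consumes characters until a final byte from 64 ('@') to 126 ('~').
def visible_length(s):
    n, i = 0, 0
    while i < len(s):
        c = ord(s[i])
        if (c == 0x1B and i + 1 < len(s) and s[i + 1] == "[") or c == 0x9B:
            # CSI: skip parameter bytes until a final byte 0x40-0x7E
            i += 2 if c == 0x1B else 1
            while i < len(s) and not (0x40 <= ord(s[i]) <= 0x7E):
                i += 1
            i += 1  # consume the final byte
        elif c == 0x1B:
            i += 2  # ESC: suck up the next character
        elif c <= 0x1F or c == 0x7F or 0x80 <= c <= 0x9F:
            i += 1  # other C0/C1 (127 lumped in): zero-width
        else:
            n += 1  # an ordinary printing character
            i += 1
    return n
```

As with the Lua version, this happily miscounts anything exotic (OSC strings, intermediate bytes), but it catches the sequences you actually meet in the wild.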
Flying on a modern airliner is also distinctly worse than passenger jet travel in the 1950’s. But I can still afford to go anywhere in the world in a day.
Worse in experience? I am almost certainly safer flying on a modern passenger jet now than in the 1950s.
Enough said. :-P
(I am aware this is not a conclusive statement.)
Heh, you can still get that (actually, much better) on international business class. Hmm, I’m curious if the cost is comparable between 1950’s tickets like those in your picture (inflation adjusted) and international business class today…