I suspect this is why on BeOS every graphical application had two threads automatically, an “application” thread and a “display” thread. I think you were expected to do nothing from the UI except pass messages to the application thread. But I was a very poor programmer when I read about all this, so I could be mistaken. I would expect that if you had a realtime guarantee on such message passes, you would get a realtime application as a result—although this assumes certain things about your message queue which I don’t remember anything about at all.
I like to say that XML is something you inflict on others and not yourself. By this I mean, it is very good for interchange when you need all parties to agree on what constitutes a valid document. YAML and JSON are not great for this, although we use JSON for it a lot in practice for the same reason we use Python and not Haskell.
I have also deployed it against myself for odd situations where I have a weird superset of information from several systems. For instance, we were migrating a database from one schema to another for compatibility reasons. I wanted to track what the source tables/columns were and what the targets were, generate documentation about both the new schema and the mapping from the old to the new and generate automated migrations to create the new schema. I made a little XML file with a trivial format for this and wrote three XSL stylesheets to generate the outputs. XML is an interesting tool. Not perfect for every scenario but it comes in handy sometimes.
for odd situations where I have a weird superset of information from several systems
Which is what we have in almost every microservice-based architecture when responses from multiple services have to be combined. We could go one step further and say that a graph data format like Turtle should be used instead of a tree-based one like XML/JSON to replace a non-trivial n-way tree merging with a straightforward graph union.
it is very good for interchange when you need all parties to agree
Seems like exactly what we should do when designing microservice architectures.
I like AppleScript but I think calling it “easy to use” is a bit of a stretch. Each application could define its own syntax. In 2005, I had to write a bit of AppleScript to help install a mail proxy for a commercial spam filter; each mail client that supported AppleScript had a completely different dictionary for doing the same stuff. It was often not obvious at all which syntax was the right one to use; you would click compile and then have a hard time figuring out what you had done wrong. There was also no real debugging. You just ran it and if it did what you wanted, great. Since even then AppleScript wasn’t widely used, applications would have bugs, or bits of the dictionary didn’t work, or there would be no obvious way of creating new items (accounts or whatever the app was concerned with).
I was also befuddled when Automator came out, but since I’m a programmer I never was really the target audience for these kinds of things. I had never even heard of Shortcuts before reading this article. It looks pretty neat, but again, no real way to debug things (I made a two-step shortcut that just said “Shortcut failed to execute”). I am glad that they haven’t given up on these technologies but I wonder what is so much better about Shortcuts over Automator that caused them to deprecate one and buy the other.
AppleScript is the only read-only language I’ve ever encountered, the opposite of Perl. I’ve never come across AppleScript written by other people that I found hard to understand, even before I learned any of the language. In contrast, writing a new AppleScript has always been a struggle for me.
The additional OTP token shared over the call was critical, because it allowed the attacker to add their own personal device to the employee’s Okta account, which allowed them to produce their own Okta MFA from that point forward. This enabled them to have an active GSuite session on that device.
So, it seems that Okta and GSuite were linked up here, perhaps by policy, so that having the Okta MFA token gave the attacker this user’s GSuite, which then gave them all of the OTPs along with everything else in Google Authenticator. The corporate GSuite account, in other words, had all of the corporate passwords in it, so all the attacker needed to get everything was to get in between one user’s Okta and their GSuite.
I sense that the neighboring IT professionals are probably torn between wanting to force a particular password service on their users (to prevent them from doing dumb shit like using Lastpass) and not wanting their entire class of users using systems they don’t really understand which might enable this kind of attack. And of course, not wanting them to use post-it notes either.
So, it seems that Okta and GSuite were linked up here, perhaps by policy
Okta is an Identity Provider. The entire point of the product is linking identities across systems. There are very good reasons for doing this.
Without an IdP, employees have to manage their own passwords across many systems. This makes taking an actual inventory during the offboarding process a nightmare. Active identities may be left lingering for years after someone leaves.
An IdP also gives you the ability to enforce policies like using MFA. If the employee manages their own identity, they can choose not to.
The real security hole in this scenario, IMO, is Google Authenticator. Since TOTP codes are generated from a simple shared secret, you’re essentially passing a plaintext password around. Once a TOTP secret is established, it should never be shared with any other system.
I have always asked myself, ever since I was introduced to Prolog early in my university module on theoretical computer science and abstract datatypes: what would I use Prolog for, and why would I use it for that?
I know two use cases where prolog is used earnestly, both deprecated these days:
Gerrit Code Review allowed creating new criteria a change must fulfill before it can be submitted. Examples
SPARK, an Ada dialect, used Prolog up to SPARK2005 (paper). From the formal verification annotations and the Ada code it created Prolog facts and rules. With those, certain queries passed if (and only if) the requirements encoded in those annotations were satisfied. They have since moved to third-party SAT solvers, which allowed them to increase the subset of Ada that can be verified (at the cost of completeness, given that SAT is NP-complete: a true statement might not be verified successfully, but a false statement never passes as true), so Prolog is gone.
Datalog, which is essentially a restricted Prolog, has made a bit of a resurgence in program verification. There are some new engines like Soufflé designed specifically by/for that community. Not an example of Prolog per se, but related to your point 2.
A way I’ve been thinking about it: what if my database were more powerful and required less boilerplate?
A big one for me: why can’t I extend a table with a view in the same way Prolog can share the same name for a fact & a predicate/rule? Querying Prolog doesn’t care whether what I’m querying comes from a fact (table) or a predicate (view).
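A tiny sketch of what I mean (names invented): the same predicate can be defined by both facts and rules, and a caller can’t tell which is which.

discount(alice, 10).              % facts, like rows in a base table
discount(bob, 5).
discount(Who, 20) :- staff(Who).  % a rule clause for the same predicate, like a view extending the table
staff(carol).

?- discount(alice, D).   % D = 10, from a fact
?- discount(carol, D).   % D = 20, from the rule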
In practice, I think this would enable a lot of apps to move application logic into the database; I think this is a great thing.
move application logic into the database, I think this is a great thing
The industry as a whole disagrees with this vehemently. I’m not sure if you were around for the early days of RDBMS stored procedure hell, but there’s a reason they’re used fairly infrequently.
We actually do stored procedures at work & test them via rspec, but it sucks. Versioning them also sucks to deal with. And the language is terrible from most perspectives; I think primarily it sucks going to an LSP-less experience.
I think the root suckiness, to me though, is trying to put a procedural language side by side with a declarative one.
This wasn’t what I was saying with views.
I do think databases could be more debuggable, & Prolog helps here because you can actually debug your queries with breakpoints and everything. Wish I could do that with SQL.
EDIT: but we continue to use stored procedures (and expand on them) because it’s just so much faster performance-wise than doing it in Rails, and I don’t think any server language could compete with doing analysis right where the data lives.
Stored procedures can absolutely be the correct approach for performance critical things (network traversal is sometimes too much), but it also really depends. It’s harder to scale a database horizontally, and every stored procedure eats CPU cycles and RAM on your DB host.
I agree; Prolog != SQL and can be really nice, which may address many of the issues with traditional RDBMS stored procedures.
I do think databases could be more debuggable, & Prolog helps here because you can actually debug your queries with breakpoints and everything. Wish I could do that with SQL.
Yeah. DBs typically have pretty horrible debugging experiences, sadly.
I feel that this is a very US-coastal point of view, one that is common at coastal start-ups and FAANG companies but not as common elsewhere. I agree with it for the most part, but I suspect there are lots of boring enterprise companies, hospitals, and universities running SQL Server (mostly on Windows) or Oracle stacks that use the stored-procedure-hell pattern. I would venture that most companies that have a job title called “DBA” use this to some extent. In any case, I think it’s far from the industry as a whole.
Nah, I started my career out at a telco in the Midwest; this is not an SV-centric opinion, those companies just have shit practices. Stored procedures are fine in moderation and in the right place, but pushing more of your application into the DB is very widely considered an anti-pattern and has been for at least a decade.
To be clear, I’m not saying using stored procedures at all is bad; the issue is that implementing stuff that’s really data-centric application logic in your database is not great. To be fair to GP, they were talking about addressing some of the things that make approaching things that way suck.
The industry as a whole disagrees with this vehemently. I’m not sure if you were around for the early days of RDBMS stored procedure hell, but there’s a reason they’re used fairly infrequently.
… in the same way Prolog can share the same name for a fact & a predicate/rule. Querying Prolog doesn’t care whether what I’m querying comes from a fact (table) or a predicate (view).
somewhat narrowly.
Sure we do not want stored procs, but
moving query complexity to a database (whether it is an in-process embedded database or an external database) is a good thing.
Queries should not be implemented manually using some form of hand-written ‘fluent’ API. This is like writing assembler by hand when optimizing compilers exist and work correctly.
These kinds of query-by-hand implementations within an app often lack global optimization opportunities (for both query and data storage).
If these by-hand implementations do include global optimizations for space and time, then they are complex and require maintenance by specialized engineers (and that increases overall engineering costs, and may make the existing system more brittle than needed).
Also, we should be using in-process databases if the data is rather static and does not need to be distributed to other processes (this is well served by embedding Prolog).
Finally, a Prolog-based query can also define ‘fitment tests’ declaratively. The query then responds by finding existing data items that fit those fitment tests. That’s a very valuable type of query for applications that need to check for the existence of data satisfying a set of often complex criteria.
Databases can also be more difficult to scale horizontally. It can also be more expensive if you’re paying to license the database software (which is relatively common). I once had the brilliant idea to implement an API as an in-process extension to the DB we were using. It was elegant, but the performance was “meh” under load, and scaling was more difficult since the whole DB had to be distributed.
I have a slightly different question: does anybody use prolog for personal computing or scripts? I like learning languages which I can spin up to calculate something or do a 20 line script. Raku, J, and Frink are in this category for me, all as different kinds of “supercalculators”. Are there one-off things that are really easy in Prolog?
I’d say anything that solves “problems” like Sudoku, or those logic puzzles I don’t know the name of: “Amy lives in the red house, Peter lives next to Grace, Grace is Amy’s grandma, the green house is on the left, who killed the mayor?” (OK, I made the last one up).
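That genre really is only a few lines of Prolog. A toy version with invented clues (member/2, permutation/2, and nextto/3 come from the list library that SWI-Prolog autoloads):

% Each house is house(Colour, Person); the list is ordered left to right.
houses(Hs) :-
    Hs = [house(C1,P1), house(C2,P2), house(C3,P3)],
    permutation([red, green, blue], [C1, C2, C3]),
    permutation([amy, peter, grace], [P1, P2, P3]),
    C1 = green,                          % the green house is on the left
    member(house(red, amy), Hs),         % Amy lives in the red house
    (   nextto(house(_, peter), house(_, grace), Hs)
    ;   nextto(house(_, grace), house(_, peter), Hs)
    ).                                   % Peter lives next to Grace

?- houses(Hs).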
When I planned my wedding I briefly thought about writing some Prolog to give me a list of who should sit at which table (i.e. here’s a group of 3, a group of 5, a group of 7 and the table sizes are X,Y,Z), but in the end I did it with a piece of paper and bruteforcing by hand.
I think it would work well for class schedules, I remember one teacher at my high school had a huge whiteboard with magnets and rumour was he locked himself in for a week before each school year and crafted the schedules alone :P
The “classical” examples in my Prolog course at uni were mostly genealogy and word stems (this was in computer linguistics), but I’m not sure if that would still make sense 20y later (I had a feeling in this particular course they were a bit behind the time even in the early 00s).
I’d be interested to see a comparison like this. I don’t really know z3, but my impression is that you typically call it as a library from a more general-purpose language like Python. So I imagine you have to be aware of how there are two separate languages: z3 values are different than Python native values, and some operations like if/and/or are inappropriate to use on z3 values because they’re not fully overloadable. (Maybe similar to this style of query builder.)
By contrast, the CLP(Z) solver in Prolog feels very native. You can write some code thinking “this is a function on concrete numbers”, and use all the normal control-flow features like conditionals, or maplist. You’re thinking about numbers, not logic variables. But then it works seamlessly when you ask questions like “for which inputs is the output zero?”.
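To make that concrete, a rough sketch (the predicate is invented; the library is clpfd on SWI-Prolog, clpz on Scryer):

:- use_module(library(clpfd)).

f(X, Y) :- Y #= X*X - 4.    % reads like a function on concrete numbers

?- f(3, Y).                              % Y = 5
?- f(X, 0), X in -10..10, label([X]).    % “for which inputs is the output zero?”
                                         % X = -2 ; X = 2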
It’s really good for parsing thanks to backtracking. When you have configuration and need to check constraints on it, logic programming is the right tool. Much of classical AI is searching state spaces, and Prolog is truly excellent for that. Plus Prolog’s predicates are symmetric as opposed to functions, which are one way, so you can run them backwards to generate examples (though SMT solvers are probably a better choice for that today).
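As a tiny illustration of both points, a DCG with an invented grammar that both parses and, run in reverse, generates examples:

ab --> [a], ab.    % one or more a's ...
ab --> [b].        % ... terminated by a single b

?- phrase(ab, [a, a, b]).             % parse: true
?- length(Xs, 3), phrase(ab, Xs).     % generate: Xs = [a, a, b]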
That subjectively resembles parser combinator libraries. I guess if you parse with a general-purpose language, even if the structure of your program resembles the structure of your sentences, you give up on getting anything for free; it’s impossible for a machine to say “why” an arbitrary program failed to give the result you wanted.
You can insert cuts to prevent backtracking past a certain point and keep a list of the longest successful parse to get some error information, but getting information about why the parse failed is hard.
I have used it to prototype solutions when writing code for things that don’t do a lot of I/O. I have a bunch of things and I want a bunch of other things but I’m unsure of how to go from one to the other.
In those situations it’s sometimes surprisingly easy to write the intermediary transformations in Prolog and once that works figure out “how it did it” so it can be implemented in another language.
Porting the solution to another language often takes multiple times longer than the initial Prolog implementation – so it is really powerful.
You could use it to define permissions. Imagine you have a web app with all kinds of rules like:
students can see their own grades in all courses
instructors and TAs can see all students’ grades in that course
people can’t grade each other in the same semester (or form grading cycles)
You can write down each rule once as a Prolog rule, and then query it in different ways:
What grades can Alice see?
Who can see Bob’s grade for course 123, Fall 2020?
Like a database, it will use a different execution strategy depending on the query. And also like a database, you can separately create indexes or provide hints, without changing the business logic.
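A rough sketch of what those rules could look like (all names invented; assume facts grade(Student, Course, Semester, Grade) and teaches(Person, Role, Course, Semester)):

% Students can see their own grades in all courses.
can_see_grade(Viewer, Viewer, Course, Semester) :-
    grade(Viewer, Course, Semester, _).

% Instructors and TAs can see all students' grades in that course.
can_see_grade(Viewer, Student, Course, Semester) :-
    teaches(Viewer, Role, Course, Semester),
    member(Role, [instructor, ta]),
    grade(Student, Course, Semester, _).

?- can_see_grade(alice, Student, Course, Semester).   % what grades can Alice see?
?- can_see_grade(Viewer, bob, cs123, fall2020).       % who can see Bob's grade for course 123, Fall 2020?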
For a real-world example, the Yarn package manager uses Tau Prolog–I think to let package authors define which version combinations are allowed.
When you have an appreciable level of strength with Prolog, you will find it to be a nice language for modeling problems and thinking about potential solutions. Because it lets you express ideas in a very high level, “I don’t really care how you make this happen but just do it” way, you can spend more of your time thinking about the nature of the model.
There are probably other systems that are even better at this (Alloy, for instance) but Prolog has the benefit of being extremely simple. Most of the difficulty with Prolog is in understanding this.
That hasn’t been my experience (I have written a non-trivial amount of Prolog, but not for a long time). Everything I’ve written in Prolog beyond toy examples has required me to understand how SLD derivation works and structure my code (often with red cuts) to ensure that SLD derivation reaches my goal.
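For anyone who hasn’t run into the term: a cut is “red” when removing it changes the program’s answers rather than just its performance. The textbook sketch:

max(X, Y, X) :- X >= Y, !.
max(_, Y, Y).

?- max(5, 3, M).   % M = 5
% Without the cut, max(5, 3, 3) would also (wrongly) succeed via the
% second clause, so correctness here depends on the cut.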
This is part of the reason that Z3 is now my go-to tool for the kinds of problems where I used to use Prolog. It will use a bunch of heuristics to find a solution and has a tactics interface that lets me guide its exploration if that fails.
I don’t want to denigrate you, but in my experience, the appearance of red cuts indicates deeper problems with the model.
I’m really curious if you can point me to a largish Prolog codebase that doesn’t use red cuts. I always considered them unavoidable (which is why they’re usually introduced so early in teaching Prolog). Anything that needs a breadth-first traversal, which (in my somewhat limited experience) tends to be most things that aren’t simple data models, requires red cuts.
Unfortunately, I can’t point you to a largish Prolog codebase at all, let alone one that meets certain criteria. However, I would encourage you to follow up on this idea at https://swi-prolog.discourse.group/ since someone there may be able to present a more subtle and informed viewpoint than I can on this subject.
I will point out that the tutorial under discussion, The Power of Prolog, has almost nothing to say about cuts; searching, I only found any mention of red cuts on this page: https://www.metalevel.at/prolog/fun, where Markus is basically arguing against using them.
Because it lets you express ideas in a very high level, “I don’t really care how you make this happen but just do it” way, you can spend more of your time thinking about the nature of the model.
So when does this happen? I’ve tried to learn Prolog a few times and I guess I always managed to pick problems which Prolog’s solver sucks at solving. And figuring out how to trick Prolog’s backtracking into behaving like a better algorithm is beyond me. I think the last attempt involved some silly logic puzzle that was really easy to solve on paper; my Prolog solution took so long to run that in the time it took I wrote and ran a brute-force search over the input space in Python, and gave up on the Prolog. I can’t find my code or remember what the puzzle was, annoyingly.
I am skeptical, generally, because in my view the set of search problems that are canonically solved with unguided backtracking is basically just the set of unsolved search problems. But I’d be very happy to see some satisfying examples of Prolog delivering on the “I don’t really care how you make this happen” thing.
How is that strange? It verifies that the bytecode in a function is safe to run and won’t underflow or overflow the stack or do other illegal things.
This was very important for the first use case of Java, namely untrusted applets downloaded and run in a browser. It’s still pretty advanced compared to the way JavaScript is loaded today.
I mean I can’t know from the description that it’s definitely wrong, but it sure sounds weird. Taking it away would obviously be bad, but that just moves the weirdness: why is it necessary? “Give the attacker a bunch of dangerous primitives and then check to make sure they don’t abuse them” seems like a bad idea to me. Sort of the opposite of “parse, don’t verify”.
Presumably JVMs as originally conceived verified the bytecode coming in and then blindly executed it with a VM in C or C++. Do they still work that way? I can see why the verifier would make sense in that world, although I’m still not convinced it’s a good design.
You can download a random class file from the internet and load it dynamically and have it linked together with your existing code. You somehow have to make sure it is actually type safe, and there are also in-method requirements that have to be followed (that it also be type safe, plus that you can’t just do pop pop pop on an empty stack). It is definitely a good design because if you prove it beforehand, then you don’t have to add runtime checks for these things.
And, depending on what you mean by “do they still work that way”: yeah, there is still bytecode verification on class load, though it may be disabled for some parts of the standard library by default in an upcoming release, from what I heard. You can also manually disable it if you want, but it is not recommended. But the most frequently run code will execute as native machine code, so there the JIT compiler is responsible for outputting correct code.
As for the prolog part, I was wrong, it is only used in the specification, not for the actual implementation.
You can download a random class file from the internet and load it dynamically and have it linked together with your existing code. You somehow have to make sure it is actually type safe, and there are also in-method requirements that have to be followed (that it also be type safe, plus that you can’t just do pop pop pop on an empty stack). It is definitely a good design because if you prove it beforehand, then you don’t have to add runtime checks for these things.
I think the design problem lies in the requirements you’re taking for granted. I’m not suggesting that just yeeting some untrusted IR into memory and executing it blindly would be a good idea. Rather I think that if that’s a thing you could do, you probably weren’t going to build a secure system. For example, why are we linking code from different trust domains?
Checking untrusted bytecode to see if it has anything nasty in it has the same vibe as checking form inputs to see if they have SQL injection attacks in them. This vibe, to be precise.
…Reading this reply back I feel like I’ve made it sound like a bigger deal than it is. I wouldn’t assume a thing was inherently terrible just because it had a bytecode verifier. I just think it’s a small sign that something may be wrong.
Honestly, I can’t really think of a different way, especially regarding type checking across boundaries. You have a square-shaped hole and you want to be able to plug squares into it, but you may have gotten them from any place. There is no way around checking whether a random thing fits the square; parsing doesn’t apply here.
Also, plain Java byte code can’t do any harm, besides crashing itself, so it is not really the case you point at — a memory-safe JVM interpreter will be memory-safe. The security issue comes from all the capabilities that JVM code can access. If anything, this type checking across boundaries is important to allow interoperability of code, and it is a thoroughly under-appreciated part of the JVM, I would say: there are not many platforms that allow linking together binaries type-safely and backwards compatibly (you can extend one and it will still work fine).
Well, how is this different from downloading and running JS? In both cases it’s untrusted code and you put measures in place to keep it from doing unsafe things. The JS parser checks for syntax errors; the JVM verifier checks for bytecode errors.
JVMs never “blindly executed” downloaded code. That’s what SecurityManagers are for. The verifier is to ensure the bytecode doesn’t break the interpreter; the security manager prevents the code from calling unsafe APIs. (Dang, I think SecurityManager might be the wrong name. It’s been soooo long since I worked on Apple’s JVM.)
I know there have been plenty of exploits from SecurityManager bugs; I don’t remember any being caused by the bytecode verifier, which is a pretty simple/straightforward theorem prover.
In my experience, it happens when I have built up enough infrastructure around the model that I can express myself declaratively rather than procedurally. Jumping to solving the problem tends to lead to frustration; it’s better to think about different ways of representing the problem and what sorts of queries are enabled or frustrated by those approaches for a while.
Let me stress that I think of it as a tool for thinking about a problem rather than for solving a problem. Once you have a concrete idea of how to solve a problem in mind—and if you are trying to trick it into being more efficient, you are already there—it is usually more convenient to express that in another language. It’s not a tool I use daily. I don’t have brand new problems every day, unfortunately.
Some logic puzzles lend themselves to pure Prolog, but many benefit from CLP or CHR. With logic puzzles specifically, it’s good to look at some example solutions to get the spirit of how to solve them with Prolog. Knowing what to model and what to omit is a bit of an art there. I don’t usually find the best solutions to these things on my own. Also, it takes some time to find the right balance of declarative and procedural thinking when using Prolog.
Separately, being frustrated at Prolog for being weird and gassy was part of the learning experience for me. I suppose there may have been a time and place when learning it was easier than the alternatives. But it is definitely easier to learn Python or any number of modern procedural languages, and the benefit seems to be greater due to wider applicability. I am glad I know Prolog and I am happy to see people learning it. But it’s not really the best tool for any job today, though it is an interesting and poorly-understood tool nonetheless.
I have an unexplored idea somewhere of using it to drive the logic engine behind an always-on “Terraform-like” controller.
Instead of defining only the state you want, it allows you to define “what actions to do to get there”, rules about what is not allowed as intermediary or final states, and even ordering.
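A rough sketch of how that could look (states and actions invented): facts describe legal transitions, a rule names forbidden states, and a simple search finds an ordering of actions that avoids them.

action(start_db,  state(app_down, db_down), state(app_down, db_up)).
action(start_app, state(app_down, db_up),   state(app_up,   db_up)).

forbidden(state(app_up, db_down)).    % never allow the app up without the DB

plan(State, State, []).
plan(From, To, [A|Rest]) :-
    action(A, From, Next),
    \+ forbidden(Next),
    plan(Next, To, Rest).

?- plan(state(app_down, db_down), state(app_up, db_up), Plan).
% Plan = [start_db, start_app]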
Datalog is used for querying some databases (Datomic, Logica, XTDB). I think the main advantages claimed over SQL are that it’s simple to learn and write and that it’s composable, plus some claims about more efficient joins which I’m skeptical about.
I went through my talk, hoping I’d done a decent job of conveying the main points, and at the end, someone in the audience stood up and said “but I can do all this in C++”.
Hah, I had the same experience with Chapel. Showed it to someone and their first response was “I can do this all in Julia.” They had exactly two days of experience with Julia.
I think just getting someone to try something out based on a demo is an intrinsically hard problem, and I wish I knew better ways to make demos more inspiring.
The most inspiring demo I’ve seen in a long time is Matthew Croughan’s “What Nix Can Do” demo. In my opinion, a big part of what makes this compelling is that he suggests at the beginning that he is not going to be able to tell you what Nix is, you’re going to have to see it. And then he shows you a variety of things before starting to take audience suggestions, which he can nimbly address on the spot.
I think resisting the idea to begin from taxonomy is a good idea. Calling Nushell “Nushell” strongly suggests to me that this is a shell, and I should think about it the way I think about bash and fish. Reading the article, it seems clear to me that I should try it as a complete novelty. This is much like how thinking of Nix as a package manager leads to pain. We first need the audience to discard their pre-existing taxonomy, somehow, so they can see how the new thing enlarges their world and creates new categories rather than slotting into the existing ones directly.
I have noticed that, for myself, misunderstanding doesn’t usually feel like a failure to understand, and often leads to frustrating questions like the kind you two have fielded here.
Many times that I’ve seen discussions of new concepts for shells (and Nushell in particular), PowerShell is notably omitted. I wouldn’t be surprised if many of the sort of people who would think/write about shells don’t use PowerShell, or haven’t seen it since 1.0 or 2.0.
Yeah, my main complaint about PowerShell is what happened when it went open source. The only commits merged were from Microsoft employees. When it came to language decisions, the community was ignored or rejected in favor of Microsoft employees’ opinions. This has led to poor decisions in new syntax features that require more verbosity and syntax noise.
PowerShell was my introduction to professional development as a Windows systems engineer. I’ll cherish my mastery of it, but I really wish it ended up being more successful when it came to the cross-platform and OSS (governance) aspects. Still needing to use Windows PowerShell to use a lot of the admin cmdlets like RSAT/ActiveDirectory is a shame.
We first need the audience to discard their pre-existing taxonomy, somehow
I think this is an interesting idea. I’m giving a talk next week on nixos and am thinking through different ways to approach the subject. For example we configure services through module options, not the packages themselves.
I’ll admit I haven’t watched all of Matthew’s talk, but I’d be curious to know what the audience thought of it.
I’m not sure about the Julia case, but there are two good responses to ‘I can do this in C++’:
You can do it less verbosely with this tool. C++ is a low-level Turing complete language. It can do anything that the environment permits. But it can’t always do it in a small amount of code.
You can’t do these undesirable things with this tool. C++ makes it trivial to violate memory- and type-safety. A big part of the selling point for higher-level languages is that they make it harder to express programs with these categories of bugs.
Forgive the off-topic question, but I couldn’t find an answer on the site itself. Why is the “th” digraph represented as ð in some places and þ in others?
ð is the ‘th’ in the, this, that, other (voiced dental fricative)
þ is the ‘th’ in both, thing, thought, three (voiceless dental fricative)
Historically English used both symbols interchangeably, but most words (that aren’t “with”) don’t use the sounds interchangeably. This setup is also how Icelandic uses Ð & Þ. If English were to reintroduce these symbols (personally in favor), I would prefer seeing this setup as it disambiguates the sounds for ESL speakers/readers and gives English a strong typographic identity (think how ñ makes you immediately think Spanish) for one of its unique characteristics: using dental fricatives.
Noteworthy: English has words like Thailand, Thomas, and Thames that have a ‘th’ that isn’t a dental fricative, which helps disambiguate those as well, before we get another US president saying “Thighland” based on spelling.
Additionally ‘ð’ on its own is “the” allowing the definite article to have a single-symbol representation like the indefinite article “a”. (I made this up, but I like the symmetry).
A more historically authentic way to compress “the” into a single column would be to put the “e” atop the “th”-symbol… although I don’t know that that would render legibly on an eth, as opposed to overlying the eth’s ascender.
Yes. 😅 Historically “&” was a part of the alphabet, but throwing even more symbols onto a keyboard makes less sense if it can be helped. I suppose a twin-barred “ð” could work, but at smaller resolutions, good luck. I would still value ð being the base tho, since it is voiced, & I think following the ð/þ distinction has more benefit than having þ carry both the voiced & voiceless sounds.
If curious, you can try to read that linked post where the whole content uses ð & þ. It doesn’t take long for it to ‘click’ & personally I think it reads smoothly, but for a broad audience (which that post is not), I wouldn’t put such a burden on the copy. But around the periphery & in personal stuff, I don’t mind being the change I would like to see.
I now have a slightly ridiculous desire to build a “shadow page” into my in-progress site generator that rewrites English to use this so that every post has a version available in this mode. It is surprisingly delightful!
It could get tricky to maintain because it’s not as simple as s/th/þ/g. I’m actually a bit surprised someone enjoyed, let alone preferred, reading like that. I figured most would be annoyed.
Yeah, the fun part would be building some actual understanding of the underlying language issues. The dumb-but-probably-actually-works version is just a regex over words, where you can add to it over time. The smarter version would actually use some kind of stronger language-aware approach that actually has the relevant linguistic tokens attached. Fun either way!
(I suspect the number of people who appreciate this is indeed nearly zero, but… I and three of my friends are the kind of people to love it. 😂)
In modern English, th has two different sounds (think vs this) but before that we used proper letters to distinguish those two sounds. It would be þink and ðis if we still used them.
This is neat, though I continue to wish this effort were being put into stabilizing flakes upstream.
I don’t think I’ll ever use it, because when I want to pin a dependency, I want it pinned, not automatically updating, and flakes already provide that to my satisfaction with tag or commit-hash URLs. When I want new stuff, I just pick a newer tag and test if it works for everything I have installed.
But I can see how other people might have use cases for this.
It is sadly not new and partially comes from the whole flakes situation, both as a symptom and a cause. Basically, the people that keep having to maintain stuff and do the work so that it is well integrated are not the same as the ones that keep presenting this stuff.
Graham does an immense amount of work within the community maintaining and extending existing software and infrastructure, as do many of the other Determinate employees.
there’s some upsetting stuff but as someone who’s been following this closely, I do see the community doing a better job of working together despite ideological differences than a few years ago. I would like things to be better but no technical community is perfect, there’s always stuff that needs to improve.
with that said, I would be doing you a disservice if your instincts are saying to run and I talk you out of it. you should trust your feelings on this stuff, they’re telling you important things.
Unfortunately I already find Nix too useful personally to run. :) But I am not entirely sure what to make of it. Especially with Eelco being both the force that generated flakes, and the force that is sort of cheesing out with FlakeHub, without really addressing the instability issues. It feels a little disingenuous, but at the same time, I’m at a pretty far remove so I’m hardly the best person to interpret the situation clearly.
I mean we are catching up years of stuff happening and being merged without support from the rest of the maintainers, half finished. This is part of trying to catch up. It will take time and a lot of effort from people that want to clean it up. This is just making visible what was left to do…
I ordered one a while back, but it wasn’t very comfortable (coming from a Kinesis Advantage) and the thumb cluster gave me thumb pains within days, so I returned it (they were very nice about returns).
I now have a Glove80 and it’s just great (bought another one for the office).
I’m a long time (20+ yrs) Advantage user, and I’m curious about other similar keyboards, but they are a) expensive and b) not immediately a huge improvement over the Advantage, so I’ll probably make it to 30 years on the same keyboard. I was vaguely curious about the Advantage 360, because I like the idea of adjusting the space between the halves, but again, expensive for maybe not any improvement? I wish there were a place I could lease a good keyboard for a month or two to decide if I like it.
The 360 is quite a garbage fire. I had a 360 Pro, but sold it. They replaced the Cherry Brown/Red switches with cheaper Gateron Browns, which have deeper actuation (more towards 3mm than 2mm), and I found it tiring to type on.
The 360 Pro uses ZMK, but has a lot of Bluetooth issues, especially connecting the halves. Someone on Reddit offers a switch replacement service and said that on one half, the key well ribbon cable runs through the clearance zone of the Bluetooth antenna.
The non-Pro 360 initially had some nasty firmware issues, but I heard they fixed some of them.
It was quite a disappointment, given that it is even 200 Euro more expensive than the Advantage2. I switched back quite quickly to my Advantage2 with KinT, before getting a Glove80 (which I absolutely love, no Bluetooth issues, better keywell, better thumb cluster).
This is disappointing, but useful, information, thank you. I don’t care about Bluetooth or custom firmware, so if those are problematic, that probably rules out the 360 for me.
I replaced my Advantage USB with an 360 Pro earlier this year and it’s such a mixed bag. I wanted the pro because I wanted Bluetooth, but honestly, I hate that you have to load custom firmware to do things that used to be built-in, like change Mac and Linux command keys. The custom firmware process they have tried to streamline as much as possible, but it still amounts to forking their Github repo, activating pipelines, making changes using their online editor, downloading files, and copying those files onto each half. This is a lot of work for something that used to be a hotkey.
The hookup between the two halves is glitchy, and it sometimes forgets how to connect. It needs to be charged every two weeks, which means the mild irritant of running two cables, one to each half, and it is kind of an overnight job. And a few of the keys are not as easy to hit as they were on the Advantage. It’s also pretty easy to accidentally hit the “reprogram firmware” keys, which puts it in a mode where you have to power cycle it to get it to work again.
Work paid for it, or I’d be more irritated. For nearly $500, I think it should be way less annoying. I don’t think Bluetooth should be the discriminator between people who want a keyboard they can just use and people who really want to reprogram the whole thing. This wasn’t clear to me at all when I bought it.
In short, I have thought about returning to the one you have. This product does not have the same quality as their earlier products.
I suppose that you use the thumb cluster on the Glove80 regularly. Are you able to use all of the keys on it?
I use a Moonlander at home and I don’t ever use the red thumb button or the bottom one because of how uncomfortable it is (and having rewritten this now, I’m considering swapping the top right thumb button - return - and the middle left thumb button - backspace - because i use backspace so much more).
This is basically my biggest complaint about the keyboard. I guess I must have small hands?
I can reach all 6 per thumb, but I’d say 4 comfortably. On the Moonlander definitely one, maybe two? The issue of the Moonlander thumb cluster is not only that the keys are far away, but also that they are at a weird angle. They don’t follow the natural thumb arc.
I have fairly long fingers, and I still find it hard to reach the red buttons and also the most inside 3 keys (hovering over the home row)… I feel a layout where the special keys are on the outside, such as on regular QWERTY, is more comfortable to use.
Nope, it uses Choc v1 switches, so it only works with Choc v1 keycaps.
I haven’t felt the need to change the caps. They come with MCC cylindrical profile, which is really nice for column stagger keyboards, since you can easily slide up/down your fingers (I guess the best description is: a half-pipe for your fingers?).
I am somewhat stalled in my rollout of Nix because of issues with Python. I don’t really think the issues are Nix’s fault so much as Python’s, so I’m not totally convinced Guix would fix it. Nix has trouble with Python packages because Python packages often have system dependencies that aren’t stated anywhere overtly, the package just fails to install if they are missing. Nix requires you to specify that somewhere. The overlay concept works, but you get caught in a slightly irritating fail / add a detail / try again loop. Similarly with Maven: there are lots of gross side-effect-y and impure things that it does under the hood. There is no standard Nix solution for dealing with Maven, because you have to kind of pick your battles.
I am more open to trying Guix now than I was a few months ago, so maybe I will see something the Guix folks have figured out that I am missing. But I fail to see how relying on language-X’s packaging system harder will address this. I’m not convinced that you can wish the problems away by relying more on the underlying build tools, which are intrinsically uninterested in real repeatability and have no way of voicing their system-level dependencies.
I also find two of the issues hammered on here not particularly salient. The first is that IMO Nix is very well-documented: all three of the language, NixPkgs, and NixOS. However, a large project needs many kinds of documentation: reference, tutorial, deep study, and other sorts. Nix has quite good reference documentation, but for deep study and tutorial documentation it mostly relies on blogs. Thus there is a problem with “freshness,” especially in the documentation that newbies are the most likely to require. A secondary point of confusion here is that too many things are named “Nix” (the language, the system, and the standard packages NixPkgs) and this is confusing, because new users wind up at the NixOS site and think that they should be jumping into NixOS first, when actually they should be learning NixPkgs and Nix-the-language/Nix-the-environment, and NixOS is actually a bit of a niche concern.
The second not-particularly-salient issue is, IMO, the language itself. This is a somewhat uninformed opinion because I haven’t tried Guix yet, but as someone who knows a few functional languages I find Nix to be not particularly surprising. It has some odd conventions, but a lot of the argumentation for Guix seems to come from a place of Lisp supremacy, which will always be a divisive place to start from. I can handle Scheme, but it’s a hard sell to developers not using Emacs actively in their daily life.
The first is that IMO Nix is very well-documented: all three of the language, NixPkgs, and NixOS. However, a large project needs many kinds of documentation: reference, tutorial, deep study, and other sorts.
As a member of the Nix documentation team I would disagree with this. Nix has lots of documentation (Nix Reference Manual, Nixpkgs Manual, NixOS Manual, Nix Pills, and now nix.dev), but it’s not terribly focused or discoverable. The information you’re looking for at any given moment has an 85% chance of existing (as long as it’s not about flakes), but you may have to bounce between multiple sources to find it.
The documentation team is woefully understaffed (there are on the order of ~5 people doing work), there’s an absolute shitload of material to sift through, and there are cultural issues that other people have referred to. We’ve only recently spun up efforts to write a tutorial series for new users (this is the part that I lead).
The other thing is that there’s a ton of beginners that want to help by writing tutorials, etc, but not enough experienced Nix users to guide, mentor, and focus those efforts.
What’s a useful way that volunteers can contribute to making Nix documentation better? I use Nix frequently and run into problems caused by lack of good documentation all the time, and I’d like to do what I can to make it better.
We have two weekly meetings for the main Documentation Team (details) and one meeting right before the Thursday meeting for the “Learning Journey Working Group” (which I lead) that’s focused on getting a tutorial series off the ground.
Note that most contributors are in Europe so the meetings are generally oriented towards their availability. RIP if you live in US Mountain (like me) or Pacific time zones.
I’m glad that the Nix documentation team disagrees with me on this, and I am glad to hear that you are working on it. I am somewhat accustomed to bouncing between different sources. I agree that it is not very focused or discoverable, and I’m glad you’re working on that. Thank you!
yeah, fundamentally, I’ve seen three core approaches to managing software complexity in the long term:
burn it to the ground. don’t use anything that’s large enough or old enough to be a maintenance burden.
encyst it. create wrapper layers whose job is to do the bare minimum to set up the inner layers and tell them to do their thing, but in a way that makes more sense to whoever wrote the outer system.
engage with it and work to clarify it and integrate it with concepts from beyond its scope.
the downsides of all three approaches should be pretty obvious, so I won’t belabor the point by getting into that now
at their best, nix and guix are trying to do (3). at their worst, they do (2), but I still prefer that they make the attempt rather than giving up before they start, as various container-centric ecosystems do.
I think, yes, relying too heavily on language-specific build systems risks falling into category (2). in particular, I think the need for somebody, at some point, to explicitly identify system-level dependencies is core to doing (3) properly. it is often the big hurdle when writing a nix derivation for something that hasn’t been packaged yet. I do notice Python being a particular offender in this regard (I have also had trouble with packaging Ruby, for similar reasons).
I would be more optimistic about #2, but my experience has taught me to treat containerization with caution, and I’ve still been burned by it. This is what made me enthusiastic about #3. A hot take on the problem I’m having is that you can get to partial success with approach #2 much faster and more easily than with #3. But #3 promises a more complete success when you do get there, one that doesn’t leave as many problems in the field to be discovered in the future. So I haven’t given up yet.
Nix has trouble with Python packages because Python packages often have system dependencies that aren’t stated anywhere overtly, the package just fails to install if they are missing.
I want to highlight this, because I feel this is a weak point in almost every language package manager. At least, I haven’t seen one that even tries to address this. I have this same problem with Node.js and Rust.
For what it’s worth, Docker is the same in this regard. You end up in the same, slow feedback loop adding system dependencies.
Thanks for submitting! Since I’ve never used it but have heard it praised, would you mind sharing how you’re using it? Is it better than things like kopia/restic/borg?
Compared to the other tools, no idea, I have never used those. Duplicity is easy, and even without Duplicity you can still unpack the backup, since it uses regular formats, not its own thing. It’s also a lot older. Supports encryption. The only downside was that a large incremental backup would sometimes not fit in the cloud storage, since the metadata file would grow larger than the single-file limit (+5GB). Not many customers hit that limit, but if they did, an archive of the backup and a new one fixed it. Version 2 should fix that by splitting the metadata up. (https://bugs.launchpad.net/duplicity/+bug/385495)
I now use Deja Dup on my desktop as a backup, not the command-line version. It often makes my entire system hang due to CPU and IO load…
I believe it is the backend of Deja Dup, which I used to use as my laptop backup. It would routinely require an hour or longer to back up my machine. I switched to Restic, which can back up my entire machine in a couple minutes, to the same device.
I wish Restic had a nice GUI like Deja Dup, but the performance difference is so stark I can’t imagine going back.
I find it profitable to think in terms of the “null program.” If I have to add a program for some reason, there’s additional cognitive load associated with it, in addition to the other points mentioned above. Somebody is going to have to learn about this program and comprehend when and how it is to be used. That’s a cost. So you have to be sure that your program is an improvement on the “null program” of not having this thing. Sometimes the null program is better.
I have a work-provided Yubikey. I consider myself a “consumer” when it comes to this technology—it’s not something I understand in great depth. I am thinking about getting a personal Yubikey to complement my usage of 1Password, because putting my Google Authenticator stuff in 1Password is very convenient but it doesn’t really amount to a second factor IMO, since if you compromise my 1Password you get all my second factors for free. However, I did notice something about limited storage on the device for these things, and it made me kind of pause my purchase. I’m not currently using “passkeys” for anything because I’m somewhat concerned about the portability story. The convenience on the Mac is great, but I don’t expect to be on Apple products for all time, or I would just use iCloud Keychain.
As a consumer who is curious about security products, the proliferation of choice here is rather confusing and I’m not altogether sure what the right thing to do is—besides it probably not being wise to just wander into a locked-in state with Apple.
I’m really disappointed with the YubiKey. It seems you still have to generate keys locally and load them onto the key. The Trezor generates its own keys, so you can use it without trusting the host. The SSH and GPG support is also much better (surprisingly…).
You can definitely do on-device generation for PGP and X.509, they just recommend against it because the key can never be backed up if you generate on-device.
Yeah, hardware wallets are pretty much the only generally useful products the crypto bubble(s) produced. Now you can generate/store/use your own private keys, back them up locally by writing down a list of words, even fireproof this storage by punching them into steel plate, then rehydrate the private key on a different hardware wallet device if your first one dies. They also have copious internal storage to deal with the various apps, which can be used to store resident keys (presumably deterministically derived from your single private key).
Agreed. Ledger devices have a FIDO2 app that uses the base secret that you got from the device to do key negotiation. This emergency access feature is so killer. The deterministic derivations from a single private key bit is absolutely genius, and even makes it easy to have multiple wallets (just do another key derivation step).
I own YubiKeys, and I would just continue a) using them to secure my KeePassXC and b) doing 2FA with FIDO on websites that support it, and just wait till this whole thing sorts itself out. For example, I hope for good wallet support inside KPXC, so I don’t have to mind how many slots my physical key would need for the 300+ websites I have in there.
1Password will handle Passkeys for you, but it’s not quite made it to the stable branch of 1Password yet. Once this is all settled in and stable, then anywhere 1P works, your passkeys will also work.
As for using a Yubikey for 1P that’s totally your decision.
Some of the open source hardware security options offer more key storage—mine has 12. However, that currently won’t be enough for this key issue if you need 100s–1000s of keys. Folks could develop open boards with a storage device attached for all those new keys, but that would need to be secured (and we still need easy sync for backups, just like most folks have spare keys to their home).
XML had some interesting ideas going for it that HTML originally did not. Among these, the principal ones are that XML is easier to parse (my evidence for this claim is that there is a much larger proliferation of XML parsers than SGML parsers) and has a straightforward concept for combining different document types in one document. IMO, it turned out that none of the potential benefits of XML were as valuable as the recovery strategies browsers had already implemented for HTML, and XML was from conception much more interested in validity than recovery (it’s a more tractable problem anyway). The rest of the discussion is, I think, mostly speculation and misplaced anger. It was an interesting idea. It didn’t pan out.
XML is still a valuable technology for many other purposes, although it is rather unfashionable at the moment.
Please continue to use XHTML5 syntax wherever possible in your tooling. It costs you almost nothing (just generate sane structures and, yes, the space + / for otherwise self-closing tags). Yes, the HTML5 parsing algorithm exists, no it’s not universally implemented, plus being able to process content with general purpose XML tools instead of pre-processing is a win every time I can do it.
This odd idea that “XHTML died” just because XHTML5 replaced XHTML2 makes no sense to me.
What I do is write my view templates in a stricter XHTML style that must be well-formed at all steps; even when you include another file, it is included as an AST branch, not as a string.
It has caught lots of errors, including ones that would have previously broken things in prod. (One time I had a sign-up form that would mysteriously not work sometimes, and it turned out that there was a tag flipped around before the submit button. Instant error with my stricter check; a mysterious, random-looking failure (there was a pattern to it, just not one immediately obvious from the initial bug reports) without it.)
One of the advantages of this kind of thing is that the added redundancy of the XML-style code lets the program better detect problems. Just using the HTML recovery algorithm might ensure you get a result, but not necessarily the result you intended to get. The XML parser does less guesswork.
It costs you nothing? Really? Most realistic apps have third-party scripts and dependencies that will end up writing to the document. Any misplaced quote or tag will result in a big blank error screen.
There is no big blank error screen with XHTML5, and I’m not sure if any browser ever bothers to support the strict parsing mode anymore. No one ever used it because it was never properly compatible with IE.
I’m having trouble thinking of a browser that implements an XML parser but not an HTML 5 parser. Can you discuss what you mean by “HTML5 parsing is not universally implemented”?
The Python stdlib has an XML parser but you need an external library (BeautifulSoup) to parse HTML. The BEAM doesn’t have a working HTML parser at all (you can get by with mochiweb_html, but it is limited). I can name two XML parsers for C off the top of my head, but can’t find one for HTML in Debian’s repository.
Yes, but there is an HTML5 spec. It’s a PITA to implement, but if you want your language to be useable for parsing the web, you really should implement it. Go did.
I’m not sure about C/C++, but given that Firefox and Webkit/Blink are open source, it must be out there somewhere.
a special list of elements that can never have children, and therefore they self-close
This is one of the biggest design flaws of HTML. Maybe you do not consider it a flaw, but it definitely degrades a potentially powerful tool into a single-purpose one. I like XML more than HTML, because XML is a universal meta-language and you can build any application (including web pages) upon it. XML separates syntax and semantics, and you can build the tree from the text serialization even if you do not (fully) understand the semantics of a given format. There is just one simple abstract rule (every element either has a start and an end tag, or it is empty: />), and you do not need an application-specific list that tells you which elements are “self-closing”.
I like XML more than HTML, because XML is a universal meta-language and you can build any application
SGML is also a “universal meta-language” and can build anything XML can, since XML is an SGML profile. And HTML, for a while, was an SGML application, just as XHTML was an XML application.
XML separates syntax and semantics, and you can build the tree from the text serialization even if you do not (fully) understand the semantics of a given format
This is an amusing claim because, again, XML is an SGML profile, and makes use of the concept of fully-tagged SGML – in oversimplified terms, “base” SGML without using a DTD to enable any minimization features. And if you don’t care about checking for conformance to a DTD, you can just as easily parse fully-tagged SGML as you can DTD-less or schema-less XML.
There is just one simple abstract rule (every element either has a start and an end tag, or it is empty: />), and you do not need an application-specific list that tells you which elements are “self-closing”.
But really, this and the previous point are getting into the distinction between well-formedness (the XML term for conformance to the base syntax of the meta-language; SGML often prefers “conforming” over “well-formed”) and validity (conformance to the grammar of a particular application).
Which is where the fun begins, because XML arguably introduced significant complications here. For example, it’s legal for two XML parsers to disagree on whether a given document is well-formed! This is not a weird rare hypothetical edge case, either, it’s something that’s extremely easy to accidentally trip over in, say, XHTML.
While I take your point, HTML hasn’t been a proper subset of SGML for a while now, and requiring a DTD in order to parse is pretty intense vs what is possible with fully-tagged or XML.
I never said HTML still was an SGML application. It’s more useful, for the error-tolerant parsing model browsers historically adopted, not to try to associate HTML with any markup meta-language.
But the fact remains that SGML can “build” anything XML can build, since “what SGML can build” is a superset of “what XML can build”.
and requiring a DTD in order to parse is pretty intense vs what is possible with fully-tagged or XML.
And yet that’s the source of one of the gotchas I mentioned. It’s trivially easy to construct a document that one (validating) XML parser would accept and say is valid XHTML 1.0 Strict, but that another (non-validating) XML parser would error on. And both parsers would be correct!
The easiest way is to exploit the differences between validating and non-validating parsers. Make, say, an XHTML document containing a named entity defined in XHTML but not one of the five base XML entities; a validating parser will say it’s fine, and a non-validating parser will throw an error on the unrecognized entity (since it is not obligated to handle entities declared in the external subset).
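A concrete sketch of that gotcha, assuming expat's default behaviour (Python's xml.etree, which never fetches the external DTD):

```python
import xml.etree.ElementTree as ET

# XHTML 1.0 Strict document using &nbsp;, which is declared in the XHTML DTD
# but is not one of XML's five predefined entities.
doc = """<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>t</title></head>
  <body><p>a&nbsp;b</p></body>
</html>"""

try:
    ET.fromstring(doc)
    print("accepted")            # what a validating parser that reads the DTD would do
except ET.ParseError as err:
    print("fatal error:", err)   # expat: "undefined entity &nbsp;: ..."
```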
One can argue about whether this is technically a well-formedness error (as opposed to some other unnamed category of error, XML 1.0 was not entirely clear on what category of error this actually is), but since the end result (document rejected with fatal parse error) is the same, one can also argue that the argument about what kind of error it is, is moot.
There is a meaningful difference between document validity and document well-formedness, and generally speaking consumers do know whether they are reaching for one or the other, so I don’t think the point is moot, although entity resolution is an interesting detail that hadn’t occurred to me.
I am aware of and understand the difference between well-formedness and validity, as should be fairly obvious from my comments in this thread.
But there is the open question of exactly what type of error it is when all of the following are true:
A non-validating parser is being used, and
It encounters an entity reference to an entity which is not one of the five predefined entities, and for which it has not previously found a declaration, and
The document is not “standalone”, and
The declaration of the entity exists in the external subset.
XML 1.1 §4.1, production 68, defines several constraints around entity references. This specific case (XHTML document containing a named entity reference from the XHTML DTD) obviously does not violate the legal-character, no-recursion, or in-DTD well-formedness constraints. And it does not violate the parsed-entity well-formedness constraint.
It also does not violate the entity-declared validation constraint, both because the entity is declared in a way that conforms to the validation constraint and because the parser is not validating anyway.
Which leaves us with one question remaining: is it a violation of the entity-declared well-formedness constraint? Well, the last line of the constraint’s definition seems to say both that this is a violation, and that in this specific case the rule regarding entity declaration is not a well-formedness constraint.
So we have something that is a constraint but neither a validation constraint nor a well-formedness constraint, and which somehow produces a fatal error, which most often results from a well-formedness constraint.
But as I already said, the effect is indistinguishable from a well-formedness error, so nailing down precisely which type of constraint and error is involved is just an academic exercise.
And getting back to the original point, this is just a single easy example of a way to produce an XML document that one parser will happily accept but another will instantly reject with a fatal error. XML is far more complex and difficult to do both reliably and portably than people like to accept, and for that reason and many others was never a great fit for the web’s primary markup language.
It doesn’t feel like an apples-to-apples comparison, because the problem above necessarily arises from differences between validating and non-validating parsers, which I expect to behave differently. I don’t really feel like this is a satisfying example of two parsers disagreeing about whether XML is well-formed. It is a good example of XML having more complexity than it appears to at first blush though.
Wirth’s Oberon system had a useful benchmark that was essentially, does adding this feature to the compiler make compiling the entire system slower? If so, it’s reverted. There’s a paper about Oberon floating around and one of the people working on it spent considerable time improving the symbol table type in the compiler from a linked list to something “better”, and then had to remove it because it slowed down compiling the overall system. It turned out that usually symbol tables were small.
Cody is an editor plugin for Sourcegraph's open source AI coding assistant (“zero-retention data-sharing”, according to the docs). It seems they offer a self-hosted option too in their ToS, after skimming it. And the FAQ answers quite a bit about how and what is sent to the online LLMs (“with the caveat that snippets of code (up to 28 KB per request) will be sent to a third party cloud service (Anthropic by default, but can also be OpenAI)”). It’s a shame the (Neo)Vim and Emacs plugins aren’t part of the MVP, since these two mainstays always resist the industry’s $current_editor zeitgeist.
I feel like the intersection of “people who use NeoVim or Emacs” and “people who will pay for a code-based chat assistant” is probably a lot smaller than with IntelliJ and VSCode.
Frankly it’s nice to see stories like this precisely because we all waste huge quantities of time on what turn out to be simple and obvious problems — obvious once you understand them!
Apart from just how much fun I had writing this, that was also a reason. I have some junior engineer friends who, because of my title or years of experience, think me so “smart”… I keep telling them that the reality is that I have just made A LOT more mistakes than them!
One of the features shared by C and C++ is that they both standardised I/O APIs that are truly painful to use. The C standard I/O interfaces were created as a thin wrapper around UNIX’s I/O and are somehow much harder to use than the thing that they abstract over. For example, the OS and the FILE structure may both do buffering, so you have two notions of what a flush really means. The FILE is thread safe, but you need to explicitly lock it if you’re doing more than a single call and the atomicity is not always what you’d expect. Most C implementations have a FILE that writes to a string for sprintf and asprintf, but don’t usually expose how to create one (so you have a generic abstraction that abstracts over precisely one thing). And that’s before you even get to things like the in-band signalling that bit the author of this post.
I have also deployed it against myself for odd situations where I have a weird superset of information from several systems. For instance, we were migrating a database from one schema to another for compatibility reasons. I wanted to track what the source tables/columns were and what the targets were, generate documentation about both the new schema and the mapping from the old to the new and generate automated migrations to create the new schema. I made a little XML file with a trivial format for this and wrote three XSL stylesheets to generate the outputs. XML is an interesting tool. Not perfect for every scenario but it comes in handy sometimes.
Which is what we have in almost every microservice-based architecture when responses from multiple services have to be combined. We could go one step further and say that a graph data format like Turtle should be used instead of a tree-based one like XML/JSON to replace a non-trivial n-way tree merging with a straightforward graph union.
Seems like exactly what we should do when designing microservice architectures.
Can you point me to Turtle? It’s not something I’ve heard of before.
https://w3c.github.io/rdf-primer/spec/#section-turtle
Thank you!
It’s a human-readable format, simplifying from N3/Notation3, to encode RDF tuples. I heart RDF and Linked Data and wish I had more excuses to use it!
This sentence is the one that raises my concerns:
So, it seems that Okta and GSuite were linked up here, perhaps by policy, so that having the Okta MFA token gave the attacker this user’s GSuite, which then gave them all of the OTPs along with everything else in Google Authenticator. The corporate GSuite account, in other words, had all of the corporate passwords in it, so all the attacker needed to get everything was to get in between one user’s Okta and their GSuite.
I sense that the neighboring IT professionals are probably torn between wanting to force a particular password service on their users (to prevent them from doing dumb shit like using Lastpass) and not wanting their entire class of users using systems they don’t really understand which might enable this kind of attack. And of course, not wanting them to use post-it notes either.
Okta is an Identity Provider. The entire point of the product is linking identities across systems. There are very good reasons for doing this.
Without an IdP, employees have to manage their own passwords across many systems. This makes taking an actual inventory during the offboarding process a nightmare. Active identities may be left lingering for years after someone leaves.
An IdP also gives you the ability to enforce policies like using MFA. If the employee manages their own identity, they can choose not to.
The real security hole in this scenario, IMO, is Google Authenticator. Since TOTP codes are derived from a simple shared secret value, you’re essentially passing a plaintext password around. Once a TOTP secret is established, it should never be shared with any other system.
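To make that concrete, TOTP (RFC 6238) is just an HMAC of a shared secret and the current time slice; anyone who holds the secret can mint valid codes forever, which is why syncing it into another system is equivalent to handing over a password. A minimal sketch (the secret shown is a made-up example):

```python
# Minimal TOTP sketch (RFC 6238, with RFC 4226 truncation). The point: the
# code is derived entirely from `secret_b32` and the clock, so the secret
# itself is the credential.
import base64, hashlib, hmac, struct, time

def totp(secret_b32: str, period: int = 30, digits: int = 6) -> str:
    key = base64.b32decode(secret_b32, casefold=True)
    counter = int(time.time()) // period
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = (struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF) % (10 ** digits)
    return str(code).zfill(digits)

# Made-up secret: anyone (or any synced service) holding it produces the same codes.
print(totp("JBSWY3DPEHPK3PXP"))
```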
I have always asked myself, ever since I was introduced to Prolog in the early stages of my university module on theoretical computer science and abstract datatypes: what would I use Prolog for, and why would I use it for that?
I know two use cases where prolog is used earnestly, both deprecated these days:
Gerrit Code Review allowed creating new criteria a change must fulfill before it can be submitted. Examples
SPARK, an Ada dialect, used prolog up to SPARK2005 (paper). Out of the formal verification annotations and the Ada code it created prolog facts and rules. With those, certain queries passed if (and only if) the requirements encoded in those annotations were satisfied. They since moved to third party SAT solvers, which allowed them to increase the subset of Ada that could be verified (at the cost of being probabilistic, given that SAT is NP-complete: a true statement might not be verified successfully, but a false statement never passes as true), so prolog is gone.
Datalog, which is essentially a restricted Prolog, has made a bit of a resurgence in program verification. There are some new engines like Soufflé designed specifically by/for that community. Not an example of Prolog per se, but related to your point 2.
Yes, Datalog is heavily used for building state-of-the-art static analyzers, see https://arxiv.org/abs/2012.10086.
This is a great comment, I never knew about the Gerrit code review criteria.
A way I’ve been thinking about it: what if my database were more powerful and had less boilerplate?
A big one for me is: why can’t I extend a table with a view in the same way Prolog can share the same name between a fact and a predicate/rule? Querying Prolog doesn’t care whether what you’re querying comes from a fact (table) or a predicate (view).
In practice I think this would enable a lot of apps to move application logic into the database, and I think that is a great thing.
The industry as a whole disagrees with this vehemently. I’m not sure if you were around for the early days of RDBMS stored procedure hell, but there’s a reason they’re used fairly infrequently.
What went wrong with them?
It’s nearly impossible to add tests of any kind to a stored procedure is the biggest one, IMO.
We actually do stored procedures at work and test them via RSpec, but it sucks. Versioning them also sucks to deal with. And the language is terrible from most perspectives; I think primarily it sucks going to an LSP-less experience.
I think to me though the root suckiness is trying to put a procedural language side by side a declarative one.
This wasn’t what I was saying with views.
I do think databases could be more debuggable, and Prolog helps here because you can actually debug your queries with breakpoints and everything. Wish I could do that with SQL.
EDIT: but we continue to use stored procedures (and expand on them) because it’s just so much faster performance-wise than doing it in Rails, and I don’t think any server language could compete with doing analysis right where the data lives.
Stored procedures can absolutely be the correct approach for performance critical things (network traversal is sometimes too much), but it also really depends. It’s harder to scale a database horizontally, and every stored procedure eats CPU cycles and RAM on your DB host.
I agree, prolog != SQL and can be really nice which may address many of the issues with traditional RDBMS stored procedures.
Yeah. DBs typically have pretty horrible debugging experiences, sadly.
I feel that this is a very US-coastal point of view, one that is common at coastal start-ups and FAANG companies but not as common elsewhere. I agree with it for the most part, but I suspect there are lots of boring enterprise companies, hospitals, and universities running SQL Server (mostly on Windows) or Oracle stacks that use the stored-procedure-hell pattern. I would venture that most companies that have a job title called “DBA” use this to some extent. In any case I think it’s far from the industry as a whole.
Nah, I started my career out at a teleco in the Midwest, this is not a SV-centric opinion, those companies just have shit practices. Stored procedures are fine in moderation and in the right place, but pushing more of your application into the DB is very widely considered an anti-pattern and has been for at least a decade.
To be clear, I’m not saying using stored procedures at all is bad; the issue is that implementing really data-centric application logic in your database is not great. To be fair to GP, they were talking about addressing some of the things that make approaching things that way suck.
@ngp, I think you are interpreting that somewhat narrowly.
Sure, we do not want stored procs, but moving query complexity into a database (whether it is an in-process embedded database or an external one) is a good thing.
Queries should not be implemented manually using some form of hand-written ‘fluent’ API. This is like writing assembler by hand when optimizing compilers exist and work correctly.
These kinds of query-by-hand implementations within an app often lack global optimization opportunities (for both the query and the data storage). If these by-hand implementations do include global optimizations for space and time, then they are complex and require maintenance by specialized engineers (which increases overall engineering costs and may make the existing system more brittle than needed).
Also, we should be using in-process databases if the data is fairly static and does not need to be distributed to other processes (this is well served by embedding Prolog).
Finally, Prolog-based querying also lets you define ‘fitment tests’ declaratively. A Prolog query then responds by finding the existing data items that fit the particular fitment tests. That’s a very valuable type of query for applications that need to check for the existence of data satisfying a set of often complex criteria.
Databases can also be more difficult to scale horizontally. It can also be more expensive if you’re paying to license the database software (which is relatively common). I once had the brilliant idea to implement an API as an in-process extension to the DB we were using. It was elegant, but the performance was “meh” under load, and scaling was more difficult since the whole DB had to be distributed.
I have a slightly different question: does anybody use prolog for personal computing or scripts? I like learning languages which I can spin up to calculate something or do a 20 line script. Raku, J, and Frink are in this category for me, all as different kinds of “supercalculators”. Are there one-off things that are really easy in Prolog?
I’d say anything that solves “problems” like Sudoku or these logic puzzles I don’t know the name of “Amy lives in the red house, Peter lives next to Grace, Grace is amy’s grandma, the green house is on the left, who killed the mayor?” (OK, I made the last one up).
When I planned my wedding I briefly thought about writing some Prolog to give me a list of who should sit at which table (i.e. here’s a group of 3, a group of 5, a group of 7 and the table sizes are X,Y,Z), but in the end I did it with a piece of paper and bruteforcing by hand.
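For what it's worth, the seating version of this is small enough to brute force in a few lines of Python; a rough sketch with made-up group and table sizes (groups must stay together, tables have fixed capacities):

```python
# Brute-force seating sketch: assign whole groups to tables without exceeding
# capacity. All numbers are made up.
from itertools import product

groups = [3, 5, 7, 2, 4]   # party sizes that must sit together
tables = [8, 8, 6]         # table capacities

def fits(assignment):
    load = [0] * len(tables)
    for size, table in zip(groups, assignment):
        load[table] += size
    return all(l <= cap for l, cap in zip(load, tables))

solutions = [a for a in product(range(len(tables)), repeat=len(groups)) if fits(a)]
print(f"{len(solutions)} valid seatings; first: {solutions[0] if solutions else None}")
```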
I think it would work well for class schedules, I remember one teacher at my high school had a huge whiteboard with magnets and rumour was he locked himself in for a week before each school year and crafted the schedules alone :P
The “classical” examples in my Prolog course at uni were mostly genealogy and word stems (this was in computer linguistics), but I’m not sure if that would still make sense 20y later (I had a feeling in this particular course they were a bit behind the time even in the early 00s).
This class of problems has a few different names: https://en.wikipedia.org/wiki/Zebra_Puzzle
Wonder if it would be good for small but intricate scheduling problems, like vacation planning. I’d compare with minizinc and z3.
I’d be interested to see a comparison like this. I don’t really know z3, but my impression is that you typically call it as a library from a more general-purpose language like Python. So I imagine you have to be aware of how there are two separate languages: z3 values are different than Python native values, and some operations like if/and/or are inappropriate to use on z3 values because they’re not fully overloadable. (Maybe similar to this style of query builder.) By contrast, the CLP(Z) solver in Prolog feels very native. You can write some code thinking “this is a function on concrete numbers”, and use all the normal control-flow features like conditionals, or maplist. You’re thinking about numbers, not logic variables. But then it works seamlessly when you ask questions like “for which inputs is the output zero?”.
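A rough illustration of that two-language friction, using the z3 Python bindings (the example itself is made up):

```python
from z3 import Int, If, Solver, sat

x = Int("x")               # a z3 term, not a Python int
absx = If(x >= 0, x, -x)   # must use z3's If: a plain `if x >= 0:` would try to
                           # cast the symbolic condition to a concrete bool and fail

s = Solver()
s.add(absx == 7, x < 0)    # "run it backwards": for which negative x is |x| == 7?
if s.check() == sat:
    print(s.model()[x])    # -7
```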
It’s really good for parsing thanks to backtracking. When you have configuration and need to check constraints on it, logic programming is the right tool. Much of classical AI is searching state spaces, and Prolog is truly excellent for that. Plus Prolog’s predicates are symmetric as opposed to functions, which are one way, so you can run them backwards to generate examples (though SMT solvers are probably a better choice for that today).
Prolog is both awesome and terrible for parsing.
Awesome: DCGs + backtracking are a killer combo
Terrible: If it fails to parse, you get a “No”, and nothing more. No indication of the row, col, failed token, nothing.
That subjectively resembles parser combinator libraries. I guess if you parse with a general-purpose language, even if the structure of your program resembles the structure of your sentences, you give up on getting anything for free; it’s impossible for a machine to say “why” an arbitrary program failed to give the result you wanted.
You can insert cuts to prevent backtracking past a certain point and keep a list of the longest successful parse to get some error information, but getting information about why the parse failed is hard.
And then your cuts are in the way for using the parser as a generator, thus killing the DCG second use.
I have used it to prototype solutions when writing code for things that don’t do a lot of I/O. I have a bunch of things and I want a bunch of other things but I’m unsure of how to go from one to the other.
In those situations it’s sometimes surprisingly easy to write the intermediary transformations in Prolog and once that works figure out “how it did it” so it can be implemented in another language.
Porting the solution to another language often takes multiple times longer than the initial Prolog implementation – so it is really powerful.
You could use it to define permissions. Imagine you have a web app with all kinds of rules about who can see and edit what.
You can write down each rule once as a Prolog rule, and then query it in different ways: for example, “can this user edit this page?” or “which pages can this user edit?”.
Like a database, it will use a different execution strategy depending on the query. And also like a database, you can separately create indexes or provide hints, without changing the business logic.
For a real-world example, the Yarn package manager uses Tau Prolog–I think to let package authors define which version combinations are allowed.
When you have an appreciable level of strength with Prolog, you will find it to be a nice language for modeling problems and thinking about potential solutions. Because it lets you express ideas in a very high level, “I don’t really care how you make this happen but just do it” way, you can spend more of your time thinking about the nature of the model.
There are probably other systems that are even better at this (Alloy, for instance) but Prolog has the benefit of being extremely simple. Most of the difficulty with Prolog is in understanding this.
That hasn’t been my experience (I have written a non-trivial amount of Prolog, but not for a long time). Everything I’ve written in Prolog beyond toy examples has required me to understand how SLD derivation works and structure my code (often with red cuts) to ensure that SLD derivation reaches my goal.
This is part of the reason that Z3 is now my go-to tool for the kinds of problems where I used to use Prolog. It will use a bunch of heuristics to find a solution and has a tactics interface that lets my guide its exploration if that fails.
I don’t want to denigrate you, but in my experience, the appearance of red cuts indicates deeper problems with the model.
I’m glad you found a tool that works for you in Z3, and I am encouraged by your comment about it to check it out soon. Thank you!
I’m really curious if you can point me to a largish Prolog codebase that doesn’t use red cuts. I always considered them unavoidable (which is why they’re usually introduced so early in teaching Prolog). Anything that needs a breadth-first traversal, which (in my somewhat limited experience) tends to be most things that aren’t simple data models, requires red cuts.
Unfortunately, I can’t point you to a largish Prolog codebase at all, let alone one that meets certain criteria. However, I would encourage you to follow up on this idea at https://swi-prolog.discourse.group/ since someone there may be able to present a more subtle and informed viewpoint than I can on this subject.
I will point out that the tutorial under discussion, The Power of Prolog, has almost nothing to say about cuts; searching, I only found any mention of red cuts on this page: https://www.metalevel.at/prolog/fun, where Markus is basically arguing against using them.
So when does this happen? I’ve tried to learn Prolog a few times and I guess I always managed to pick problems which Prolog’s solver sucks at solving. And figuring out how to trick Prolog’s backtracking into behaving like a better algorithm is beyond me. I think the last attempt involved some silly logic puzzle that was really easy to solve on paper; my Prolog solution took so long to run that I wrote and ran a bruteforce search over the input space in Python in the time, and gave up on the Prolog. I can’t find my code or remember what the puzzle was, annoyingly.
I am skeptical, generally, because in my view the set of search problems that are canonically solved with unguided backtracking is basically just the set of unsolved search problems. But I’d be very happy to see some satisfying examples of Prolog delivering on the “I don’t really care how you make this happen” thing.
As an example, I believe Java’s class loader verifier is written in prolog (even the specification is written in a prolog-ish way).
class loader verifier? What a strange thing to even exist… but thanks, I’ll have a look.
How is that strange? It verifies that the bytecode in a function is safe to run and won’t underflow or overflow the stack or do other illegal things.
This was very important for the first use case of Java, namely untrusted applets downloaded and run in a browser. It’s still pretty advanced compared to the way JavaScript is loaded today.
I mean I can’t know from the description that it’s definitely wrong, but it sure sounds weird. Taking it away would obviously be bad, but that just moves the weirdness: why is it necessary? “Give the attacker a bunch of dangerous primitives and then check to make sure they don’t abuse them” seems like a bad idea to me. Sort of the opposite of “parse, don’t verify”.
Presumably JVMs as originally conceived verified the bytecode coming in and then blindly executed it with a VM in C or C++. Do they still work that way? I can see why the verifier would make sense in that world, although I’m still not convinced it’s a good design.
You can download a random class file from the internet and load it dynamically and have it linked together with your existing code. You somehow have to make sure it is actually type safe, and there are also in-method requirements that have to be followed (that also be type safe, plus you can’t just do pop pop pop on an empty stack). It is definitely a good design because if you prove it beforehand, then you don’t have to add runtime checks for these things.
And, depending on what you mean by “do they still work that way”, yeah, there is still byte code verification on class load, though it may be disabled for some part of the standard library by default in an upcoming release, from what I heard. You can also manually disable it if you want, but it is not recommended. But the most often ran code will execute as native machine code, so there the JIT compiler is responsible for outputting correct code.
As for the prolog part, I was wrong, it is only used in the specification, not for the actual implementation.
I think the design problem lies in the requirements you’re taking for granted. I’m not suggesting that just yeeting some untrusted IR into memory and executing it blindly would be a good idea. Rather I think that if that’s a thing you could do, you probably weren’t going to build a secure system. For example, why are we linking code from different trust domains?
Checking untrusted bytecode to see if it has anything nasty in it has the same vibe as checking form inputs to see if they have SQL injection attacks in them. This vibe, to be precise.
…Reading this reply back I feel like I’ve made it sound like a bigger deal than it is. I wouldn’t assume a thing was inherently terrible just because it had a bytecode verifier. I just think it’s a small sign that something may be wrong.
Honestly, I can’t really think of a different way, especially regarding type checking across boundaries. You have a square-shaped hole and you want to be able to plug there squares, but you may have gotten them from any place. There is no going around checking if random thing fits a square, parsing doesn’t apply here.
Also, plain Java byte code can’t do any harm, besides crashing itself, so it is not really the case you point at — a memory-safe JVM interpreter will be memory-safe. The security issue comes from all the capabilities that JVM code can access. If anything, this type checking across boundaries is important to allow interoperability of code, and it is a thoroughly under-appreciated part of the JVM I would say: there is not many platforms that allow linking together binaries type-safely and backwards compatibly (you can extend one and it will still work fine).
Well, how is this different from downloading and running JS? In both cases it’s untrusted code and you put measures in place to keep it from doing unsafe things. The JS parser checks for syntax errors; the JVM verifier checks for bytecode errors.
JVMs never “blindly executed” downloaded code. That’s what SecurityManagers are for. The verifier is to ensure the bytecode doesn’t break the interpreter; the security manager prevents the code from calling unsafe APIs. (Dang, I think SecurityManager might be the wrong name. It’s been soooo long since I worked on Apple’s JVM.)
I know there have been plenty of exploits from SecurityManager bugs; I don’t remember any being caused by the bytecode verifier, which is a pretty simple/straightforward theorem prover.
In my experience, it happens when I have built up enough infrastructure around the model that I can express myself declaratively rather than procedurally. Jumping to solving the problem tends to lead to frustration; it’s better to think about different ways of representing the problem and what sorts of queries are enabled or frustrated by those approaches for a while.
Let me stress that I think of it as a tool for thinking about a problem rather than for solving a problem. Once you have a concrete idea of how to solve a problem in mind—and if you are trying to trick it into being more efficient, you are already there—it is usually more convenient to express that in another language. It’s not a tool I use daily. I don’t have brand new problems every day, unfortunately.
Some logic puzzles lend themselves to pure Prolog, but many benefit from CLP or CHR. With logic puzzles specifically, it’s good to look at some example solutions to get the spirit of how to solve them with Prolog. Knowing what to model and what to omit is a bit of an art there. I don’t usually find the best solutions to these things on my own. Also, it takes some time to find the right balance of declarative and procedural thinking when using Prolog.
Separately, being frustrated at Prolog for being weird and gassy was part of the learning experience for me. I suppose there may have been a time and place when learning it was easier than the alternatives. But it is definitely easier to learn Python or any number of modern procedural languages, and the benefit seems to be greater due to wider applicability. I am glad I know Prolog and I am happy to see people learning it. But it’s not the best tool for any job today really—but an interesting and poorly-understood tool nonetheless.
I have an unexplored idea somewhere of using it to drive the logic engine behind an always-on, terraform-like controller.
Instead of defining only the state you want, it allows you to define “what actions to do to get there”, rules for what is not allowed as intermediate or final states, and even ordering.
All things that terraform makes hard right now.
Datalog is used for querying some databases (datomic, logica, xtdb). I think the main advantages claimed over SQL are that it’s simple to learn and write, composable, and there are some claims about more efficient joins which I’m skeptical about.
https://docs.datomic.com/pro/query/query.html#why-datalog has some justifications of their choice of datalog.
Datomic and XTDB see some real world use as application databases for clojure apps. Idk if anyone uses logica.
Hah, I had the same experience with Chapel. Showed it to someone and their first response was “I can do this all in Julia.” They had exactly two days of experience with Julia.
I think getting someone to adopt or try something out based on a demo is just an intrinsically hard problem, and I wish I knew better ways to make demos more inspiring.
The most inspiring demo I’ve seen in a long time is Matthew Croughan’s “What Nix Can Do” demo. In my opinion, a big part of what makes this compelling is that he suggests at the beginning that he is not going to be able to tell you what Nix is, you’re going to have to see it. And then he shows you a variety of things before starting to take audience suggestions, which he can nimbly address on the spot.
I think resisting the idea to begin from taxonomy is a good idea. Calling Nushell “Nushell” strongly suggests to me that this is a shell, and I should think about it the way I think about bash and fish. Reading the article, it seems clear to me that I should try it as a complete novelty. This is much like how thinking of Nix a package manager leads to pain. We first need the audience to discard their pre-existing taxonomy, somehow, so they can see how the new thing enlarges their world and creates new categories rather than slotting into the existing ones directly.
I have noticed that, for myself, misunderstanding doesn’t usually feel like a failure to understand, and often leads to frustrating questions like the kind you two have fielded here.
I don’t think I would go that far. It feels like PowerShell without the .NET to me.
Many times that I’ve seen discussions of new concepts for shells (and Nushell in particular), PowerShell is notably omitted. I wouldn’t be surprised if many of the sort of people who would think/write about shells don’t use PowerShell, or haven’t seen it since 1.0 or 2.0.
Yeah, my main complaint about PowerShell is what happened when it went open source. The only commits merged were from Microsoft employees. When it came to language decisions, the community was ignored or rejected in favor of Microsoft employees’ opinions. This has led to poor decisions in new syntax features that require more verbosity and syntax noise.
PowerShell was my introduction to professional development as a Windows systems engineer. I’ll cherish my mastery of it, but I really wish it had ended up being more successful when it came to the cross-platform and OSS (governance) aspects. Still needing Windows PowerShell to use a lot of the admin cmdlets like RSAT/ActiveDirectory is a shame.
I think this is an interesting idea. I’m giving a talk next week on nixos and am thinking through different ways to approach the subject. For example we configure services through module options, not the packages themselves.
I’ll admit I haven’t watched all of Matthew’s talk, but I’d be curious to know what the audience thought of it.
I’m not sure about the Julia case, but there are two good responses to ‘I can do this in C++’:
Forgive the off-topic question, but I couldn’t find an answer on the site itself. Why is the “th” digraph represented as ð in some places and þ in others?
Noted in: https://toast.al/posts/techlore/2023-07-03_weblog-rebuilt
Historically English used both symbols interchangeably, but most words (that aren’t “with”) don’t use the sounds interchangeably. This setup is also how Icelandic uses Ð & Þ. If English were to reintroduce these symbols (I’m personally in favor), I would prefer this setup, as it disambiguates the sounds for ESL speakers/readers and gives English a strong typographic identity (think how ñ makes you immediately think of Spanish) for one of its unique characteristics: using dental fricatives.
Noteworthy: English has words like Thailand, Thomas, Thames that have ‘th’ that aren’t dental fricatives which helps disambiguate those as well before we get another US president saying “Thighland” based on spelling.
A more historically authentic way to compress “the” into a single column would be to put the “e” atop the “th”-symbol… although I don’t know that that would render legibly on an eth, as opposed to overlying the eth’s ascender.
Yes. 😅 Historically “&” was a part of the alphabet, but throwing even more symbols onto a keyboard makes less sense if it can be helped. I suppose a twin-barred “ð” could work, but at smaller resolutions, good luck. I would still value ð being the base tho, since it is voiced, and I think following the ð/þ distinction has more benefit than choosing þ for both the voiced and voiceless sounds.
Very interesting, thank you for the explanation!
If curious, you can try to read that linked post where the whole content uses ð & þ. It doesn’t take long for it to ‘click’ & personally I think it reads smoothly, but for a broad audience (which that post is not), I wouldn’t put such a burden on the copy. But around the periphery & in personal stuff, I don’t mind being the change I would like to see.
I now have a slightly ridiculous desire to build a “shadow page” into my in-progress site generator that rewrites English to use this so that every post has a version available in this mode. It is surprisingly delightful!
It could get tricky to maintain, because it’s not as simple as s/th/þ/g. I’m actually a bit surprised someone enjoyed, let alone preferred, reading like that; I figured most would be annoyed. You could do it ðe opposite way, where you write content wiþ þorns/eðs and automatically replace ðem wiþ “th”.
Ðat… is a very not-dumb way to go about it. :)
Yeah, the fun part would be building some actual understanding of the underlying language issues. The dumb-but-probably-actually-works version is just a regex over words, where you can add to it over time. The smarter version would actually use some kind of stronger language-aware approach that actually has the relevant linguistic tokens attached. Fun either way!
(I suspect the number of people who appreciate this is indeed nearly zero, but… I and three of my friends are the kind of people to love it. 😂)
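The direction suggested above (write wiþ þorns and eðs, publish plain “th”) really is just a pair of substitutions; a rough sketch:

```python
import re

# Hypothetical publish-time filter: authored text uses þ/ð, output uses "th".
def to_plain_th(text: str) -> str:
    text = re.sub(r"[þð]", "th", text)
    return re.sub(r"[ÞÐ]", "Th", text)   # assumes capitals only start words

print(to_plain_th("Ðat þing over ðere"))  # -> "That thing over there"
```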
ð is voiced (“that”) and þ is unvoiced (“thing”). Feel your throat as you pronounce both and you’ll understand the difference.
In modern English, th has two different sounds (think vs this) but before that we used proper letters to distinguish those two sounds. It would be þink and ðis if we still used them.
This is neat, though I continue to wish this effort were being put into stabilizing flakes upstream.
I don’t think I’ll ever use it, because when I want to pin a dependency, I want it pinned, not automatically updating, and flakes already provide that to my satisfaction with tag or commit-hash URLs. When I want new stuff, I just pick a newer tag and test if it works for everything I have installed.
But I can see how other people might have use cases for this.
It’s happening: https://github.com/NixOS/rfcs/pull/136 And it’s a similar crowd working on both ends of this problem.
That is… Not the same crowd. Literally the author of this rfc is berating determinate systems in the forum thread announcing FlakeHub.
I think you’re referring to this thread which is very interesting. The disharmony on display is really disheartening to me, as a newish user of Nix.
It is sadly not new, and it partially comes from the whole flakes situation, both as a symptom and a cause. Basically, the people who keep having to maintain stuff and do the work so that it is well integrated are not the same as the ones who keep presenting this stuff.
this is just patently incorrect.
Graham does an immense amount of work within the community maintaining and extending existing software and infrastructure, as do many of the other Determinate employees.
there’s some upsetting stuff but as someone who’s been following this closely, I do see the community doing a better job of working together despite ideological differences than a few years ago. I would like things to be better but no technical community is perfect, there’s always stuff that needs to improve.
with that said, I would be doing you a disservice if your instincts are saying to run and I talk you out of it. you should trust your feelings on this stuff, they’re telling you important things.
(edit: left out an important clause)
Unfortunately I already find Nix too useful personally to run. :) But I am not entirely sure what to make of it. Especially with Eelco being both the force that generated flakes, and the force that is sort of cheesing out with FlakeHub, without really addressing the instability issues. It feels a little disingenuous, but at the same time, I’m at a pretty far remove so I’m hardly the best person to interpret the situation clearly.
Belatedly: Having had a few days to digest this and see what’s being said about it elsewhere, I share that concern.
Not to be cynical, but that’s an RFC for a plan to stabilize the CLI, with flakes still to get an RFC after all of that is done.
I mean, we are catching up on years of stuff happening and being merged half-finished, without support from the rest of the maintainers. This is part of trying to catch up. It will take time and a lot of effort from the people who want to clean it up. This is just making visible what was left to do…
I ordered one a while back, but it wasn’t very comfortable (coming from a Kinesis Advantage) and the thumb cluster gave me thumb pains within days, so I returned it (they were very nice about returns).
I now have a Glove80 and it’s just great (bought another one for the office).
I’m a long time (20+ yrs) Advantage user, and I’m curious about other similar keyboards, but they are a) expensive and b) not immediately a huge improvement over the Advantage, so I’ll probably make it to 30 years on the same keyboard. I was vaguely curious about the Advantage 360, because I like the idea of adjusting the space between the halves, but again, expensive for maybe not any improvement? I wish there were a place I could lease a good keyboard for a month or two to decide if I like it.
The 360 is quite a garbage fire. I had a 360 Pro, but sold it. They replaced the Cherry Brown/Red switches with cheaper Gateron Browns, which have deeper actuation (more towards 3mm than 2mm), and I found it tiring to type on.
The 360 Pro uses ZMK, but has a lot of Bluetooth issues, especially connecting the halves. Someone on Reddit offers a switch replacement service and said that on one half, the key-well ribbon cable runs through the clearance zone of the Bluetooth antenna.
The non-Pro 360 initially had some nasty firmware issues, but I heard they fixed some of them.
It was quite a disappointment, given that it is even 200 Euro more expensive than the Advantage2. I switched back quite quickly to my Advantage2 with KinT, before getting a Glove80 (which I absolutely love, no Bluetooth issues, better keywell, better thumb cluster).
I have both the regular and pro models of the advantage 360, and they both work quite well for me.
Definitely the firmware programming aspect with ZMK was a hassle with the 360 Pro.
I still prefer my Advantage 2, but I’m mostly typing on my Advantage 360 these days because the split keypads are really nice.
The build quality is excellent compared to, for instance, the Ergodox that I own; it’s on par with my Advantage 2.
Just offering another anecdotal experience into the mix here.
This is disappointing, but useful, information, thank you. I don’t care about Bluetooth or custom firmware, so if those are problematic, that probably rules out the 360 for me.
I replaced my Advantage USB with an 360 Pro earlier this year and it’s such a mixed bag. I wanted the pro because I wanted Bluetooth, but honestly, I hate that you have to load custom firmware to do things that used to be built-in, like change Mac and Linux command keys. The custom firmware process they have tried to streamline as much as possible, but it still amounts to forking their Github repo, activating pipelines, making changes using their online editor, downloading files, and copying those files onto each half. This is a lot of work for something that used to be a hotkey.
The hookup between the two halves is glitchy, and it sometimes forgets how to connect. It needs to be charged every two weeks, which means the mild irritant of running two cables, one to each half, and it’s kind of an overnight job. And a few of the keys are not as easy to hit as they were on the Advantage. It’s also pretty easy to accidentally hit the “reprogram firmware” keys, which puts it in a mode where you have to power cycle it to get it to work again.
Work paid for it, or I’d be more irritated. For nearly $500, I think it should be way less annoying. I don’t think Bluetooth should be the discriminator between people who want a keyboard they can just use and people who really want to reprogram the whole thing. This wasn’t clear to me at all when I bought it.
In short, I have thought about returning to the one you have. This product does not have the same quality as their earlier products.
This is a big bummer. Oh well.
I suppose that you use the thumb cluster on the Glove80 regularly. Are you able to use all of the keys on it?
I use a Moonlander at home and I don’t ever use the red thumb button or the bottom one because of how uncomfortable it is (and having rewritten this now, I’m considering swapping the top right thumb button - return - and the middle left thumb button - backspace - because I use backspace so much more).
This is basically my biggest complaint about the keyboard. I guess I must have small hands?
I can reach all 6 per thumb, but I’d say 4 comfortably. On the Moonlander definitely one, maybe two? The issue of the Moonlander thumb cluster is not only that the keys are far away, but also that they are at a weird angle. They don’t follow the natural thumb arc.
I have fairly long fingers, and I still find it hard to reach the red buttons and also the innermost 3 keys (hovering over the home row)… I feel a layout where the special keys are on the outside, as on regular QWERTY, is more comfortable to use.
Does the Glove80 work with high-profile caps like SA? Is it even compatible with Cherry-stem caps? It looks like it might be slim Chocs.
Nope, it uses Choc v1 switches, so it only works with Choc v1 keycaps.
I haven’t felt the need to change the caps. They come with MCC cylindrical profile, which is really nice for column stagger keyboards, since you can easily slide up/down your fingers (I guess the best description is: a half-pipe for your fingers?).
Should work fine with SA keycaps: the switches are regular MX style (https://www.zsa.io/moonlander/keyswitches). Er, wups, the question was about the Glove80, not the Moonlander.
That keyboard looks really interesting! Can you customize the keys of it like the Moonlander?
Yeah, you can try their layout editor here:
https://my.glove80.com/
They use the open source ZMK firmware.
I am somewhat stalled in my rollout of Nix because of issues with Python. I don’t really think the issues are Nix’s fault so much as Python’s, so I’m not totally convinced Guix would fix them. Nix has trouble with Python packages because Python packages often have system dependencies that aren’t stated anywhere overtly; the package just fails to install if they are missing. Nix requires you to specify them somewhere. The overlay concept works, but you get caught in a slightly irritating fail, add detail, try again loop. Similarly with Maven: there are lots of gross, side-effect-y, impure things that it does under the hood. There is no standard Nix solution for dealing with Maven because you have to kind of pick your battles.
I am more open to trying Guix now than I was a few months ago, so maybe I will see something the Guix folks have figured out that I am missing. But I fail to see how relying on language-X’s packaging system harder will address this. I’m not convinced that you can wish the problems away by relying more on the underlying build tools which are intrinsically disinterested in real repeatability and have no way of voicing their system-level dependencies.
I also find two of the issues hammered on here not particularly salient. The first is that IMO Nix is very well-documented: all three of the language, NixPkgs, and NixOS. However, a large project needs many kinds of documentation: reference, tutorial, deep study, and other sorts. Nix has quite good reference documentation, but for deep study and tutorial documentation it mostly relies on blogs. Thus there is a problem with “freshness,” especially in the documentation that newbies are the most likely to require. A secondary point of confusion here is that too many things are named “Nix” (the language, the system, and the standard packages NixPkgs) and this is confusing, because new users wind up at the NixOS site and think that they should be jumping into NixOS first, when actually they should be learning NixPkgs and Nix-the-language/Nix-the-environment, and NixOS is actually a bit of a niche concern.
The second not-particularly-salient issue is, IMO, the language itself. This is a somewhat uninformed opinion because I haven’t tried Guix yet, but as someone who knows a few functional languages I find Nix to be not particularly surprising. It has some odd conventions, but a lot of the argumentation for Guix seems to come from a place of Lisp supremacy, which will always be a divisive place to start from. I can handle Scheme, but it’s a hard sell to developers not using Emacs actively in their daily life.
As a member of the Nix documentation team I would disagree with this. Nix has lots of documentation (Nix Reference Manual, Nixpkgs Manual, NixOS Manual, Nix Pills, and now nix.dev), but it’s not terribly focused or discoverable. The information you’re looking for at any given moment has an 85% chance of existing (as long as it’s not about flakes), but you may have to bounce between multiple sources to find it.
The documentation team is woefully understaffed (there’s on the order of ~5 people doing work), there’s an absolute shitload of material to sift through, and cultural issues that other people have referred to. We’ve only recently spun up efforts to write a tutorial series for new users (this is the part that I lead).
The other thing is that there’s a ton of beginners that want to help by writing tutorials, etc, but not enough experienced Nix users to guide, mentor, and focus those efforts.
What’s a useful way that volunteers can contribute to making Nix documentation better? I use Nix frequently and run into problems caused by lack of good documentation all the time, and I’d like to do what I can to make it better.
We have two weekly meetings for the main Documentation Team (details) and one meeting right before the Thursday meeting for the “Learning Journey Working Group” (which I lead) that’s focused on getting a tutorial series off the ground.
Note that most contributors are in Europe so the meetings are generally oriented towards their availability. RIP if you live in US Mountain (like me) or Pacific time zones.
I’m glad that the Nix documentation team disagrees with me on this, and I am glad to hear that you are working on it. I am somewhat accustomed to bouncing between different sources. I agree that it is not very focused or discoverable, and I’m glad you’re working on that. Thank you!
yeah, fundamentally, I’ve seen three core approaches to managing software complexity in the long term:
1. burn it to the ground. don’t use anything that’s large enough or old enough to be a maintenance burden.
2. encyst it. create wrapper layers whose job is to do the bare minimum to set up the inner layers and tell them to do their thing, but in a way that makes more sense to whoever wrote the outer system.
3. engage with it and work to clarify it and integrate it with concepts from beyond its scope.
the downsides of all three approaches should be pretty obvious, so I won’t belabor the point by getting into that now
at their best, nix and guix are trying to do (3). at their worst, they do (2), but I still prefer that they make the attempt rather than giving up before they start, as various container-centric ecosystems do.
I think, yes, relying too heavily on language-specific build systems risks falling into category (2). in particular, I think the need for somebody, at some point, to explicitly identify system-level dependencies is core to doing (3) properly. it is often the big hurdle when writing a nix derivation for something that hasn’t been packaged yet. I do notice Python being a particular offender in this regard (I have also had trouble with packaging Ruby, for similar reasons).
so this is a long-winded way to agree with you :)
I think your analysis is spot on.
I would be more optimistic about #2, but experience has taught me to treat containerization with caution, and I’ve still been burned by it. This is what made me enthusiastic about #3. A hot take on the problem I’m having is that you can get to partial success with approach #2 much faster and more easily than with #3. But #3 promises a more complete success when you do get there, one that doesn’t leave as many problems in the field to be discovered in the future. So I haven’t given up yet.
I want to highlight this, because I feel this is a weak point in almost every language package manager. At least, I haven’t seen one that even tries to address this. I have this same problem with Node.js and Rust.
For what it’s worth, Docker is the same in this regard. You end up in the same slow feedback loop of adding system dependencies.
Thanks for submitting! Since I’ve never used it but have heard it praised, would you mind sharing how you’re using it? Is it better than things like kopia/restic/borg?
At a previous job I wrote a wrapper around duplicity to very easily back up a Linux VPS to an OpenStack object store, including menu-based restore: http://web.archive.org/web/20230808083309/https://www.cloudvps.com/knowledgebase/entry/2453-cloudvps-linux-backup-object-store/ and that was used on over 9000 servers. https://github.com/CloudVPS/CloudVPS-Boss
Compared to the other tools, no idea; I have never used those. Duplicity is easy, and even without duplicity you can still unpack the backup, since it uses regular formats rather than its own thing. It’s also a lot older. Supports encryption. The only downside was that a large incremental backup would sometimes not fit in the cloud storage, since the metadata file would grow larger than the single-file limit (5 GB). Not many customers hit that limit, but if they did, archiving the backup and starting a new one fixed it. Version 2 should fix that by splitting the metadata up (https://bugs.launchpad.net/duplicity/+bug/385495).
I now use Deja Dup on my desktop as a backup, not the command-line version. It often makes my entire system hang due to CPU and I/O load…
Solid design decisions, thanks. Now that 2.0 is released, the bug should get closed soon hopefully.
I believe it is the backend of Deja Dup, which I used to use as my laptop backup. It would routinely require an hour or longer to back up my machine. I switched to Restic, which can back up my entire machine in a couple minutes, to the same device.
I wish Restic had a nice GUI like Deja Dup, but the performance difference is so stark I can’t imagine going back.
I find it profitable to think in terms of the “null program.” If I have to add a program for some reason, there’s additional cognitive load associated with it, in addition to the other points mentioned above. Somebody is going to have to learn about this program and comprehend when and how it is to be used. That’s a cost. So you have to be sure that your program is an improvement on the “null program” of not having this thing. Sometimes the null program is better.
I have a work-provided Yubikey. I consider myself a “consumer” when it comes to this technology—it’s not something I understand in great depth. I am thinking about getting a personal Yubikey to complement my usage of 1Password, because putting my Google Authenticator stuff in 1Password is very convenient but it doesn’t really amount to a second factor IMO, since if you compromise my 1Password you get all my second factors for free. However, I did notice something about limited storage on the device for these things, and it made me kind of pause my purchase. I’m not currently using “passkeys” for anything because I’m somewhat concerned about the portability story. The convenience on the Mac is great, but I don’t expect to be on Apple products for all time, or I would just use iCloud Keychain.
As a consumer who is curious about security products, the proliferation of choice here is rather confusing and I’m not altogether sure what the right thing to do is—besides it probably not being wise to just wander into a locked-in state with Apple.
I’m really disappointed with the Yubikey. It seems you still have to generate keys locally and load them onto the key. The Trezor generates its own keys, so you can use it without trusting the host. The SSH and GPG support is also much better (surprisingly…)
You can definitely do on-device generation for PGP and X.509, they just recommend against it because the key can never be backed up if you generate on-device.
Yeah, hardware wallets are pretty much the only generally useful products the crypto bubble(s) produced. Now you can generate/store/use your own private keys, back them up locally by writing down a list of words, even fireproof this storage by punching them into steel plate, then rehydrate the private key on a different hardware wallet device if your first one dies. They also have copious internal storage to deal with the various apps, which can be used to store resident keys (presumably deterministically derived from your single private key).
Agreed. Ledger devices have a FIDO2 app that uses the base secret that you got from the device to do key negotiation. This emergency access feature is so killer. The deterministic derivations from a single private key bit is absolutely genius, and even makes it easy to have multiple wallets (just do another key derivation step).
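To illustrate the derivation idea, here is a toy sketch of “one master secret, many derived keys” (this is not the actual BIP32/SLIP-10 construction hardware wallets use; the function and path names are made up for illustration):

```python
import hashlib
import hmac

def derive_child(master_secret: bytes, path: str) -> bytes:
    """Deterministically derive a per-purpose key from one master secret.

    Toy HMAC-SHA512 construction for illustration only; real hardware
    wallets use BIP32/SLIP-10, which adds chain codes, hardened
    derivation, and elliptic-curve arithmetic on top of this idea.
    """
    return hmac.new(master_secret, path.encode(), hashlib.sha512).digest()[:32]

# The master secret is what you back up (e.g. as a list of words).
master = hashlib.sha256(b"words you punched into a steel plate").digest()

# The same master secret always yields the same children, so restoring the
# backup on a new device restores every derived key "for free".
fido_key = derive_child(master, "fido2/example.com")
wallet_key = derive_child(master, "wallet/0")
```

The property that matters is exactly the one described above: back up one secret, and every key derived from it can be regenerated on a replacement device.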
I own Yubikeys, and I would just continue a) using them to secure my KeePassXC and b) doing 2FA FIDO on websites that support it, and wait till this whole thing sorts itself out. For example, I hope for good wallet support inside KPXC, so I don’t have to mind how many slots my physical key would need for the 300+ websites I have in there.
1Password will handle passkeys for you, but the feature hasn’t quite made it to the stable branch of 1Password yet. Once this is all settled in and stable, your passkeys will work anywhere 1P works.
As for using a Yubikey for 1P that’s totally your decision.
Some of the open source hardware security options offer more key storage (mine has 12). However, that currently won’t be enough if you need hundreds to thousands of keys. Folks could develop open boards with a storage device attached for all those new keys, but that storage would need to be secured (and we still need easy sync for backups, just like most folks keep spare keys to their home).
XML had some interesting ideas going for it that HTML originally did not. Among these, the principal ones are that XML is easier to parse (my evidence for this claim is that there is a much larger proliferation of XML parsers than SGML parsers) and has a straightforward concept for combining different document types in one document. IMO, it turned out that none of the potential benefits of XML were as valuable as the recovery strategies browsers had already implemented for HTML, and XML was from conception much more interested in validity than recovery (it’s a more tractable problem anyway). The rest of the discussion is, I think, mostly speculation and misplaced anger. It was an interesting idea. It didn’t pan out.
XML is still a valuable technology for many other purposes, although it is rather unfashionable at the moment.
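On the “combining different document types in one document” point, here is a minimal sketch (Python stdlib only; the “inventory” vocabulary and its URN are made up) of what XML namespaces make routine: a generic XML tool can walk a mixed document and keep the vocabularies apart without understanding either one’s semantics.

```python
import xml.etree.ElementTree as ET

# One document mixing two vocabularies: XHTML plus a made-up inventory schema.
doc = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:inv="urn:example:inventory">
  <body>
    <p>Current stock:</p>
    <inv:item inv:sku="A-17">Widget</inv:item>
  </body>
</html>"""

root = ET.fromstring(doc)

# Each tag carries its namespace URI, so the two vocabularies never collide.
for element in root.iter():
    print(element.tag)
# {http://www.w3.org/1999/xhtml}html
# {http://www.w3.org/1999/xhtml}body
# {http://www.w3.org/1999/xhtml}p
# {urn:example:inventory}item
```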
Please continue to use XHTML5 syntax wherever possible in your tooling. It costs you almost nothing (just generate sane structures and, yes, the space + / for otherwise self-closing tags). Yes, the HTML5 parsing algorithm exists; no, it’s not universally implemented; and being able to process content with general-purpose XML tools instead of pre-processing is a win every time I can do it.
This odd idea that “XHTML died” just because XHTML5 replaced XHTML2 makes no sense to me.
What I do is write my view templates in a stricter XHTML style that must be well-formed at every step; even when you include another file, it is included as an AST branch, not as a string.
It has caught lots of errors, including ones that would have previously broken things in prod. (One time I had a sign-up form that would mysteriously not work sometimes, and it turned out a tag had gotten flipped around before the submit button. Instant error with my stricter check; a mysterious, random-looking failure without it (there was a pattern to it, just not one immediately obvious from the initial bug reports).)
Another advantage of this kind of thing is that the added redundancy of the XML-style markup lets the program detect problems better. Just using the HTML recovery algorithm might ensure you get a result, but not necessarily the result you intended to get. The XML parser does less guesswork.
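A minimal sketch of that difference, using only the Python stdlib and a hypothetical template fragment: the strict XML parse fails loudly on a markup mistake, while a lenient HTML tokenizer (standing in here for a browser-style recovery algorithm) happily produces something.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical template fragment with a mistake: the input tag is never closed.
broken = "<form><p>Sign up</p><input type='submit'></form>"

# Strict XML parsing fails immediately and points at the problem.
try:
    ET.fromstring(broken)
except ET.ParseError as err:
    print("XML parser rejects it:", err)  # mismatched tag

# The lenient tokenizer keeps going and hands you *a* result,
# whether or not it is the one you meant.
class TagLogger(HTMLParser):
    def handle_starttag(self, tag, attrs):
        print("HTML parser saw:", tag)

TagLogger().feed(broken)  # no error raised
```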
It costs you nothing? Really? Most realistic apps have third-party scripts and dependencies that will end up writing to the document. Any misplaced quote or tag will result in a big blank error screen.
Are you sure that costs you nothing?
There is no big blank error screen with XHTML5, and I’m not sure if any browser ever bothers to support the strict parsing mode anymore. No one ever used it because it was never properly compatible with IE.
I’m having trouble thinking of a browser that implements an XML parser but not an HTML 5 parser. Can you discuss what you mean by “HTML5 parsing is not universally implemented”?
Non-browsers need to parse HTML/XML too.
The Python stdlib has an XML parser, but for proper HTML parsing you need an external library (BeautifulSoup). The BEAM doesn’t have a working HTML parser at all (you can get by with mochiweb_html, but it is limited). I can name two XML parsers for C off the top of my head, but can’t find one for HTML in Debian’s repository.
Even more so, most HTML parsers do not implement the HTML5 parsing algorithm yet, just some custom guesses they made up back in the day.
Yes, but there is an HTML5 spec. It’s a PITA to implement, but if you want your language to be usable for parsing the web, you really should implement it. Go did.
I’m not sure about C/C++, but given that Firefox and Webkit/Blink are open source, it must be out there somewhere.
This is one of the biggest design flaws of HTML. Maybe you do not consider it a flaw, but it definitely degrades a potentially powerful tool to a single-purpose one. I like XML more than HTML, because XML is a universal meta-language and you can build any application (including web pages) upon it. XML separates syntax and semantics, and you can build the tree from the text serialization even if you do not (fully) understand the semantics of the given format. There is just one simple abstract rule (every element either has a start tag and an end tag, or is empty and written with a self-closing />), and you do not need any application-specific list that tells you which elements are “self-closed.”

SGML is also a “universal meta-language” and can build anything XML can, since XML is an SGML profile. And HTML, for a while, was an SGML application, just as XHTML was an XML application.
This is an amusing claim because, again, XML is an SGML profile, and makes use of the concept of fully-tagged SGML – in oversimplified terms, “base” SGML without using a DTD to enable any minimization features. And if you don’t care about checking for conformance to a DTD, you can just as easily parse fully-tagged SGML as you can DTD-less or schema-less XML.
But really, this and the previous point are getting into the distinction between well-formedness (the XML term for conformance to the base syntax of the meta-language; SGML often prefers “conforming” over “well-formed”) and validity (conformance to the grammar of a particular application).
Which is where the fun begins, because XML arguably introduced significant complications here. For example, it’s legal for two XML parsers to disagree on whether a given document is well-formed! This is not a weird rare hypothetical edge case, either, it’s something that’s extremely easy to accidentally trip over in, say, XHTML.
While I take your point, HTML hasn’t been a proper subset of SGML for a while now, and requiring a DTD in order to parse is pretty intense vs. what is possible with fully-tagged SGML or XML.
I never said HTML still was an SGML application. It’s more useful, for the error-tolerant parsing model browsers historically adopted, not to try to associate HTML with any markup meta-language.
But the fact remains that SGML can “build” anything XML can build, since “what SGML can build” is a superset of “what XML can build”.
And yet that’s the source of one of the gotchas I mentioned. It’s trivially easy to construct a document that one (validating) XML parser would accept and say is valid XHTML 1.0 Strict, but that another (non-validating) XML parser would error on. And both parsers would be correct!
Can you give an example of a document where two XML parsers disagree on whether it is well-formed? I can’t think of one.
The easiest way is to exploit the differences between validating and non-validating parsers. Make, say, an XHTML document containing a named entity defined in XHTML but not one of the five base XML entities; a validating parser will say it’s fine, and a non-validating parser will throw an error on the unrecognized entity (since it is not obligated to handle entities declared in the external subset).
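Here is a minimal sketch of the non-validating half of that, using Python’s stdlib (expat-based, non-validating) parser on a hypothetical XHTML document; a validating parser that actually reads the XHTML DTD would accept the very same bytes.

```python
import xml.etree.ElementTree as ET

# &nbsp; is declared in the XHTML DTD, but it is not one of the five
# entities built into XML itself (&lt; &gt; &amp; &apos; &quot;).
doc = """<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Entity test</title></head>
  <body><p>&nbsp;</p></body>
</html>"""

# expat does not fetch the external DTD subset, so the reference
# to &nbsp; is a fatal error here...
try:
    ET.fromstring(doc)
except ET.ParseError as err:
    print("rejected:", err)  # undefined entity &nbsp;

# ...while a DTD-reading, validating parser would resolve the entity
# and accept the document.
```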
One can argue about whether this is technically a well-formedness error (as opposed to some other unnamed category of error; XML 1.0 was not entirely clear on what category of error this actually is), but since the end result (document rejected with a fatal parse error) is the same, one can also argue that the argument about what kind of error it is, is moot.
There is a meaningful difference between document validity and document well-formedness, and generally speaking consumers do know whether they are reaching for one or the other, so I don’t think the point is moot, although entity resolution is an interesting detail that hadn’t occurred to me.
I am aware of and understand the difference between well-formedness and validity, as should be fairly obvious from my comments in this thread.
But there is the open question of exactly what type of error it is when all of the following are true: the document references an entity (like the XHTML named entity in the example above) that is declared only in the external DTD subset, the document does not declare standalone="yes", and the parser is non-validating and does not read that external subset.
XML 1.1 §4.1, production 68, defines several constraints around entity references. This specific case (XHTML document containing a named entity reference from the XHTML DTD) obviously does not violate the legal-character, no-recursion, or in-DTD well-formedness constraints. And it does not violate the parsed-entity well-formedness constraint.
It also does not violate the entity-declared validation constraint, both because the entity is declared in a way that conforms to the validation constraint and because the parser is not validating anyway.
Which leaves us with one question remaining: is it a violation of the entity-declared well-formedness constraint? Well, the last line of the constraint’s definition seems to say both that this is a violation, and that in this specific case the rule regarding entity declaration is not a well-formedness constraint.
So we have something that is a constraint but neither a validation constraint nor a well-formedness constraint, and which somehow produces a fatal error, which most often results from a well-formedness constraint.
But as I already said, the effect is indistinguishable from a well-formedness error, so nailing down precisely which type of constraint and error is involved is just an academic exercise.
And getting back to the original point, this is just a single easy example of a way to produce an XML document that one parser will happily accept but another will instantly reject with a fatal error. XML is far more complex and difficult to do both reliably and portably than people like to accept, and for that reason and many others was never a great fit for the web’s primary markup language.
It doesn’t feel like an apples-to-apples comparison, because the problem above necessarily arises from differences between validating and non-validating parsers, which I expect to behave differently. I don’t really feel like this is a satisfying example of two parsers disagreeing about whether XML is well-formed. It is a good example of XML having more complexity than it appears to at first blush though.
This seems like a useful tool to radically improve programming language design. You might design a language in a red-green-refactor TDD-like way:
Pareto-optimally hill-climb into a local-maximum programming language. 🫣
Wirth’s Oberon system had a useful benchmark that was essentially: does adding this feature to the compiler make compiling the entire system slower? If so, it gets reverted. There’s a paper about Oberon floating around in which one of the people working on it spent considerable time changing the compiler’s symbol-table type from a linked list to something “better,” and then had to remove it because it slowed down compiling the overall system. It turned out that symbol tables were usually small.
Cody, Sourcegraph’s open source AI coding assistant, delivered as an editor plugin (“zero-retention data-sharing,” according to the docs). After skimming their ToS, it seems they offer a self-hosted option too. The FAQ answers quite a bit about how and what is sent to the online LLMs (“with the caveat that snippets of code (up to 28 KB per request) will be sent to a third party cloud service (Anthropic by default, but can also be OpenAI)”). It’s a shame the (Neo)Vim and Emacs plugins aren’t part of the MVP, since these two mainstays always resist the industry’s $current_editor zeitgeist.
I feel like the intersection of “people who use NeoVim or Emacs” and “people who will pay for a code-based chat assistant” is probably a lot smaller than with IntelliJ and VSCode.
This one hit home for me since the Big Work Project is basically this not-pipeline system. And yes, I made all the mistakes that type-1 makes.
Frankly it’s nice to see stories like this precisely because we all waste huge quantities of time on what turn out to be simple and obvious problems — obvious once you understand them!
apart from just how much fun I had writing this, that was also a reason. I have some junior engineer friends who, because of my title or years of experience, think I’m so “smart”… I keep telling them that the reality is that I’ve just made A LOT more mistakes than they have!
One of the features shared by C and C++ is that they both standardised I/O APIs that are truly painful to use. The C standard I/O interfaces were created as a thin wrapper around UNIX’s I/O and are somehow much harder to use than the thing that they abstract over. For example, the OS and the FILE structure may both do buffering, so you have two notions of what a flush really means. The FILE is thread safe, but you need to explicitly lock it if you’re doing more than a single call and the atomicity is not always what you’d expect. Most C implementations have a FILE that writes to a string for sprintf and asprintf, but don’t usually expose how to create one (so you have a generic abstraction that abstracts over precisely one thing). And that’s before you even get to things like the in-band signalling that bit the author of this post.