Is this the first big Go project at Microsoft? I had no idea they could choose Go; I would have assumed it just wouldn’t be acceptable, because they wouldn’t want to unnecessarily legitimize a Google project.
Their web browser runs on Chromium; I think it’s a little late to avoid legitimizing Google projects.

They had already lost the web platform by the time they switched to Chromium.
When I think of MS and “developers, developers, developers”, I was led to believe that they’d have more pride in their own development platforms.
I wonder whether the switch to Chromium, the embrace of Linux in Azure, or the move from VS to VSCode would have happened under Ballmer or Gates. I suspect Gates’ new book is like Merkel’s: retrospective, but avoiding the most controversial questions (Russian gas in her case, giving up dogfooding in his). Am I foolish to expect more critical introspection?
I think corporate open source is now “proven”, and companies don’t worry so much about this kind of thing.
Google has used TypeScript for quite a while now (even though they had their own Closure compiler, which was not staffed consistently, internally).
All of these companies are best thought of as “frenemies” :) They cooperate in some ways, and compete in others.
They have many common interests, and collaborate on, say, lobbying the US government.
When Google was small, and Microsoft was big, the story was different. They even had different employee bases.
But now they are basically peers, and employees go freely back and forth
Microsoft has actually maintained a security-hardened build of Go for a while now: https://devblogs.microsoft.com/go/

I can see a blog post saying they made changes for FIPS compliance, which is usually understood to be about satisfying government requirements rather than a security improvement. I can’t see a summary of any other changes.
MS is a pretty big company, so I wouldn’t be surprised if they have a ton of internal services in Go. Plus with Azure being a big part of their business, I doubt they care what language people use as long as you deploy it on their platforms.
Can confirm that some internal services were written in Go as of a couple of years ago. (I worked for Microsoft at the time.) I didn’t have a wide enough view to know if it was “a ton,” though.

We embrace and we extend!
I’m too lazy to look up a source, but some of the error messages I’ve seen in Azure make it clear Microsoft uses Go to build their cloud.

Messages like this one:
dial tcp 192.168.1.100:3000: connect: timeout
Keep in mind Microsoft also maintains their own Go fork for FIPS compliance. I don’t know exactly their use case, but I can tell you at Google we did the same thing for the same reason. I think the main FIPS-compliant Go binaries being shipped were probably GKE/k8s related.
(Edit: removed off-topic ramble, sorry, I have ADHD.)

Could this message just be coming from Kubernetes?
All of the demos I’ve seen so far of “vibe coding” have been videogames. In games there’s more wiggle room for being “almost correct”: the game’s still playable if your sprite is bigger than you want. How well does vibe coding work on a large, complex task with stricter requirements?

(By “vibe coding”, I’m using Simon’s definition of not even looking at the code. Using it as an augment doesn’t count!)
It occurred to me that, in my view, “software development” means the products I’m most aware of: database-backed line-of-business applications with a ton of hooks for APIs, reporting, mobile views, etc.
But for a lot of people, “software development” probably means: create a mobile app and hope it goes viral on an app store. Very different lifetime and support requirements, as well as different revenue streams.
Whether GenAI is appropriate for either is outside my expertise.
Asks for your email address for the code? No thanks.

I’m so tired of the constant begging to sign up for mailing lists, etc. Exhausting.

Oh man, I didn’t even notice the “unsubscribe at any time”. Does asking for the code automatically put you on his newsletter? That’s a dark pattern alright.

When it got posted elsewhere recently, someone mirrored the code on GitHub: https://github.com/jwenjian/scroll-buddy

Having to constantly be alert to dark patterns like this is getting very tiresome, indeed.

Doesn’t seem to work on Windows Firefox.

Works on Windows + Firefox here.

Nor Linux FF.
Do the other people have to live with January 2013 for the rest of their lives? Or is it only engineering that has to deal with every dirty hack since the beginning of the organization? I wonder if non-engineering parts of a company properly recognize this “code is forever” situation.
I think the answer to “is this something totally unique to software engineering?” is almost always “no.” One example: ever notice how in lots of public libraries, the programming section is next to the astrology section? It’s because the Dewey Decimal system didn’t have any room in the technology block for new topics, so everything computer science had to be shoved in the miscellaneous section (next to paranormal). Libraries have to live with January 1876 for the rest of their lives!
It’s interesting to me that they never played Slay the Spire until 2023! It’s my favorite game of all time and I consider it largely foundational to the genre of modern deck-building roguelikes.
Thank goodness I avoided playing it until now because I surely would have just copied their incredible design (intentionally or subconsciously).
Really interesting perspective and way to approach game development! Ignore the herd and do your own thing, then go back and explore the genre. Kinda the inverse of what I expected.
That’s what leapt out to me too. If he had played other games from the start, Balatro wouldn’t be nearly so unique. If he hadn’t played other games at the end, it wouldn’t have had the stake system, which makes the game so much more replayable.
Research makes sense for exploring the design space for solving a specific design problem (i.e. “I know I want a mechanic/dynamic for X. What mechanics have other designers used for this?”), which should really only happen after you’ve gotten through the first steps of, “What experience am I trying to make? What mechanics would I need to generate that experience?” It’s how I prefer to do my game design as well, and I’m much happier with the systems I’ve made as a result. You get more novel ideas that way.
Doing things in reverse is how you get a preponderance of clones (Hollow Knight, for instance, spawning a zillion Metroidvanias that share certain key traits with HK) where people want to make a game that’s Just Like The Thing I Like. They take a package of mechanics based on something they like and flex one or two of them. It makes something slightly new, but not wholly new, and ultimately it feels very same-y to the source material.
This is fun! Up until now the only succinct representation I’d heard of was the “succinct circuit”. Imagine we have a 10-node graph where all edges exist (“complete”) except the edge 1->2. You could describe it in a conventional way (adjacency list, edge dict, matrix) or you could describe it as a function:
is_edge(x, y) = !(x == 1 && y == 2)
Succinct circuits can only represent a subset of graphs, but they use a fraction of the space. After all, they’re succinct!
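For concreteness, here is a small sketch (Python, names of my own choosing) of the two representations of that example: the explicit adjacency matrix versus the succinct predicate. Both answer the same edge queries, but the predicate takes constant space while the matrix takes O(n²).

N = 10

# Conventional representation: an n x n adjacency matrix, O(n^2) space.
matrix = [[not (u == 1 and v == 2) for v in range(N)] for u in range(N)]

# Succinct representation: a tiny function (the "circuit") deciding edge membership, O(1) space.
def is_edge(u, v):
    return not (u == 1 and v == 2)

# Both agree on every query.
assert all(matrix[u][v] == is_edge(u, v) for u in range(N) for v in range(N))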
Here’s the really interesting bit to me: checking if an edge exists is ~O(1) in a matrix. In a circuit, it’s P-complete. And this “lifts” the complexity of graph algorithms by one tier. P-complete problems become EXPTIME-complete, NP-complete problems become NEXPTIME-complete. I find that super cool!
I’ll make an attempt, with the caveat that this list seems so obvious to me that I’m worried I might be missing some nuance (imagine a similar list about cooking utensils with “people think knives can only be used for butter, but in reality they can also be used to cut bread, meat, and even vegetables!!!”).
Sentences in all languages can be templated as easily as in English: {user} is in {location} etc.
Both the substitutions and the surrounding text can depend on each other. The obvious example is languages where nouns have gender, but you might also have cases like Japanese where “in” might be へ, で, or に to indicate relative precision of the location.
Words that are short in English are short in other languages too.
German is the classic example of using lengthy compound words where English would use a shorter single-purpose word, “Rindfleisch” vs “beef” or “Lebensmittel” vs “food” (why yes I haven’t had lunch yet, why do you ask…?).
For any text in any language, its translation into any other language is approximately as long as the original.
See above – English -> German tends to become longer, English -> Chinese tends to become shorter.
For every lower-case character, there is exactly one (language-independent) upper-case character, and vice versa.
Turkish and German are famous counter-examples, with Turkish 'i' / 'I' being different letters, or German ß capitalizing to "SS" (though I think this is now considered somewhat old-fashioned?).
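As a quick check of that (behavior as I recall it on CPython 3; worth verifying in a REPL):

s = "straße"
print(s.upper())                # STRASSE - one character becomes two
print(s.upper().lower() == s)   # False - the round trip is not the identity

# Turkish dotted/dotless i: the default, locale-independent mapping loses the distinction.
print("İ".lower())              # 'i' plus a combining dot above (two code points)
print("I".lower())              # 'i' - right for English, wrong for Turkish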
The lower-case/upper-case distinction exists in all languages.
Not true in Chinese, Japanese, Korean.
All languages have words for exactly the same things as English.
Every language has words that don’t exist in any other language. Sometimes because the concept is alien (English has no native word for 寿司), sometimes because a general concept has been subdivided in a different way (English has many words for overcast misty weather that don’t translate easily into languages from drier climates).
Every expression in English, however vague and out-of-context, always has exactly one translation in every other language.
I’m not sure what this means because many expressions in English don’t even have a single explanation in English, but in any case, idioms and double entendres often can’t be translated directly.
All languages follow the subject-verb-object word order.
If one’s English to SVO order is limited, limited too must their knowledge of literature be.
When words are to be converted into Title Case, it is always the first character of the word that needs to be capitalized, in all languages.
Even English doesn’t follow a rule of capitalizing the first character of every word. Title Casing The First Letter Of Every Word Is Bad Style.
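Even the naive algorithm misbehaves on plain English and Dutch input (again CPython 3, as I remember its str.title() behavior):

print("don't stop me now".title())    # Don'T Stop Me Now - the apostrophe is treated as a word break
print("ijsvrij amsterdam".title())    # Ijsvrij Amsterdam - Dutch wants IJsvrij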
Every language has words for yes and no.
One well-known counter-example being languages where agreement is by repeating a verb:
A: “Do you want to eat lunch together?”
B: “Eat.”
In each language, the words for yes and no never change, regardless of which question they are answering.
See above.
There is always only one correct way to spell anything.
Color / colour, aluminum / aluminium
Each language is written in exactly one alphabet.
Not sure exactly what this means – upper-case vs lower-case? Latin vs Cyrillic? 漢字 vs ひらがな カタカナ ? 简化字 vs 繁体字 ? Lots of counter-examples to choose from, Kazakh probably being a good one.
All languages (that use the Latin alphabet) have the same alphabetical sorting order.
Some languages special-case ordering of letter combinations, such as ij in Dutch.
And then there are the dozens of European languages that have their own letters outside the standard 26, or diacritics.

Lithuanian sorts 'y' between 'i' and 'j': https://stackoverflow.com/questions/14458314/letter-y-comes-after-i-when-sorting-alphabetically
All languages are written from left to right.
Arabic, Hebrew.
Even in languages written from right to left, the user interface still “flows” from left to right.
Not sure what “flows” means here, but applications with good RtL support usually flip the entire UI – for example a navigational menu that’s on the right in English would be on the left in Arabic.
Every language puts spaces between words.
Segmenting a sentence into words is as easy as splitting on whitespace (and maybe punctuation).
Chinese, Japanese.
Segmenting a text into sentences is as easy as splitting on end-of-sentence punctuation.
English: "Dear Mr. Smith".
No language puts spaces before question marks and exclamation marks at the end of a sentence.
No language puts spaces after opening quotes and before closing quotes.
French famously has rules that differ from English regarding spacing around punctuation.
All languages use the same characters for opening quotes and closing quotes.
“ ” in English, 「 」 in Japanese, « » in French, and so on.
Numbers, when written out in digits, are formatted and punctuated the same way in all languages.
European languages that use '.' for thousands separator and ',' for the fractional separator, or languages that group by different sizes (like lakh/crore in Indian languages).
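A tiny illustration (Python; real code should lean on locale/CLDR data, e.g. the babel package, rather than swapping separators by hand):

n = 1234567.89
print(f"{n:,.2f}")                                        # 1,234,567.89 (US/UK grouping)
print(f"{n:,.2f}".translate(str.maketrans(",.", ".,")))   # 1.234.567,89 (German-style separators)
# Indian-style lakh/crore grouping would be 12,34,567.89 - groups of two after the first three.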
No two languages are so similar that it would ever be difficult to tell them apart.
Many languages are considered distinct for political reasons, even if a purely linguistic analysis would consider them the same language.
Languages that have similar names are similar.
English (as spoken in Pittsburgh), English (as spoken in Melbourne), and English (as spoken in Glasgow).
More seriously, Japanese and Javanese.
Icons that are based on English puns and wordplay are easily understood by speakers of other languages.
Often they’re difficult to understand even for English speakers (I once saw a literal hamburger used to signify a collapsible sidebar).
Geolocation is an accurate way to predict the user’s language.
Nobody who has ever travelled would think this. And yet. AND YET!
C’mon Google, I know that my IP is an airport in Warsaw but I really don’t want the Maps UI to switch to Polish when I’m trying to find a route to my hotel.
Country flags are accurate and appropriate symbols for languages.
You can roughly gauge where you are in the world by whether the local ATMs offer “🇬🇧 English”, “🇺🇸 English”, or “🇦🇺 English”.
Every country has exactly one “national” language.
Belgium, Luxembourg, Switzerland.
Every language is the “national” language of exactly one country.

English, again.
Turkish and German are famous counter-examples, with Turkish ‘i’ / ‘I’ being different letters, or German ß capitalizing to “SS” (though I think this is now considered somewhat old-fashioned?).
The German ß has history.
The old rule is that ß simply has no uppercase. Capitalizing it as “SS” was the default fallback rule if you had to absolutely capitalize everything and the ß would look bad (such as writing “STRAßE” => “STRASSE”). Using “SZ” was also allowed in some cases.
The new rule is to use the uppercase ß: ẞ. So instead of “STRASSE” you now write “STRAẞE”.
The usage of “SZ” was disallowed in 2006. East Germany had an uppercase ß from 1957; the West German rules basically said “an uppercase ß is in development”, and that was dropped in 1984 in favour of the rule to use SS or SZ as the uppercase variant. The new uppercase ß has been in the rules since 2017, and since 2024 the uppercase ß is preferred over SS.
The DIN 5008 standard was likewise updated in 2020.

This means that depending on what document you’re processing, based on when and where it was created, its writing of the uppercase ß may be radically different.
It should also be noted that in Switzerland ß is not used at all; there, the ss substitute is used even in lower case.
Family names may also have custom capitalization rules, where ß can be replaced by SS, SZ, ẞ or even HS, so “Großmann” can become “GROHSMANN”. Note that this depends on the person: while Brother Großmann may write “GROHSMANN”, Sister Großmann may write “GROSSMANN” and their mother may use “GROẞMANN”, and these are all valid and equivalent.
Umlauts may also be uppercased by dropping the diacritic and adding an E suffix: ä becomes “AE”. In some cases even lowercase input gets this treatment because older systems can’t handle special characters, though this is not GDPR compliant.
No two languages are so similar that it would ever be difficult to tell them apart.
Many languages are considered distinct for political reasons, even if a purely linguistic analysis would consider them the same language.
If you ever want to have fun, the politics and regionality of German dialects could be enough to drive some linguists up the wall.
Bavarian is recognized as a language and a dialect at the same time; it can be subdivided into dozens and dozens of subdialects, which are all similar but whose speakers may struggle to understand each other.

As someone who grew up in Swabian Bavaria, my dialect is a mix of both Swabian and Bavarian. I struggle to understand Northern Bavarian, but I struggle much less with Basel Swiss German (which is distinct from other Swiss German in that it originates from Low Alemannic instead of High Alemannic), which is quite close in a lot of ways.

And the Swiss then double down on making things confusing by sometimes using French language constructs in German words, or straight up importing French or Italian words.
What should I read to learn more about this? Why wasn’t the character in Unicode 1.0, then?

East Germany added the uppercase ß in 1957 and removed it in 1984. The spelling rules weren’t updated, so despite the presence of an uppercase ß, it would have been wrong to use it in any circumstances. Since Unicode 1.0 is from somewhere around 1992, with some early drafts in 1988, it basically missed the uppercase ß being in the dictionary.
The uppercase ß itself has been around since 1905, and we’ve tried to get it into Unicode since roughly 2004.

Is this more like there being an attested occurrence in a particular dictionary in East Germany in 1957, rather than common usage in East Germany?
Every expression in English, however vague and out-of-context, always has exactly one translation in every other language.
I’m not sure what this means because many expressions in English don’t even have a single explanation in English, but in any case, idioms and double entendres often can’t be translated directly.
A good example of this is a CMS I used to work on. The way it implemented translation was to define everything using English[0], then write translations as a mapping from those English snippets to the intended language. This is fundamentally flawed, e.g. by homonyms:
Subject         From   Flags              Actions
--------------------------------------------------
Project update  Alice  Unread, Important  [Read] [Delete]
Welcome         HR     Read               [Read] [Delete]
Here, the “Read” flag means “this has been read”, whilst the “Read” button means “I want to read this”. Using the English as a key forces the same translation on both.
[0] We used British English, except for the word “color”, since we felt it was better to match the CSS keywords (e.g. when defining themes, etc.).
One trick is to use a different word on the asset: Reviewed (adj) and Review (v) don’t have the same problem that Read (adj) and Read (v) do. Seen (adj) and See (v); Viewed (adj) and View (v). And so on. Then you can “translate” to English to actually use Unread/Read/[Read] if you still like it, without confusing the translator, who needs to know you want something more like e.g. Lido/Ler or 阅读/显示, and so on.
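A minimal sketch of the idea (Python; the keys and the Portuguese strings beyond Lido/Ler from the example above are mine): key the translations on meaning rather than on the English surface form, so the flag and the button can diverge where a language needs them to.

MESSAGES = {
    "en": {"flag.read": "Read", "action.read": "Read"},   # English happens to collapse the two
    "pt": {"flag.read": "Lido", "action.read": "Ler"},    # Portuguese cannot
}

def t(lang, key):
    return MESSAGES[lang][key]

print(t("en", "flag.read"), t("en", "action.read"))   # Read Read
print(t("pt", "flag.read"), t("pt", "action.read"))   # Lido Ler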
Much better than the original article. Also love how many of the counterexamples come from English.

My bar for these lists is https://yourcalendricalfallacyis.com/ and most “falsehoods programmers believe” lists don’t meet it.

The number of exceptions caused by the Hebrew calendar makes me shed a tear of joy.
Here’s one falsehood they missed: the length of a year varies by at most one day. True in Gregorian calendar, apparently true in the Islamic calendar, but not true in the Hebrew calendar: leap years are 30 days longer than regular years.
They sorta cover it on the “days” section, by way of mentioning that the Hebrew calendar has leap months.
They also miss Byzantine calendars which are still used by many churches, related to the Jewish Greek calendar from the Septuagint. It’s of course complicated by the fact that many churches & groups do not agree on what year was the start, so it’s complex to use (but still in somewhat fairly common liturgical use).
Here’s a fun (counter)example of (something like) this one from my heritage language:
In each language, the words for yes and no never change, regardless of which question they are answering.
(Context: the word for enjoy/like is the same in the language, so when translating to English, I choose whichever sounds most natural in each given example sentence.)
When someone says, “do you (enjoy/)like it?”, if you want to say “yes, I like it”, that’s fine, but if you want to say you don’t like it, you would say, “I don’t want it”; if you were to say, “I don’t like it” in that situation, it would mean, “I don’t want it”. The same reversal happens if they ask, “do you want it?”, and you want to respond in the negative.
So someone would say, “do you want a chocolate bar?”, and you’d say, “no, I don’t want it”, and that would mean, “no, (I already know) I don’t (usually/habitually) enjoy it (when I have it), (therefore I don’t want it)”, whereas, “no, I don’t enjoy it” would just straightforwardly mean, “I don’t want it”.
(You can also respond with just, “no!” instead of using a verb in the answer.)
This only happens in the habitual present form. Someone might ask, “do you like chocolate?” before they offer you some, and you can say, “no, I don’t want it”, but if they already gave you a chocolate bar to try, they may ask, “did you like it?” in the past tense, and you’d have to respond with, “I didn’t like it” instead of, “I didn’t want it”. And, “do you want chocolate?” would be met with, “no, I don’t like it”, but “did you want chocolate?” would be met with, “no, I didn’t want it”, and that second one would just mean what it straightforwardly sounds like in English.
(Strictly speaking, it doesn’t need to be a response to a question, I’m just putting it into a context to show that the verb used in the answer isn’t just a negative form of the same verb used in the question.)
(It’s hard to explain because if you were to translate this literalistically to English, it wouldn’t even be noticed, since saying, “no, I don’t like it” in response to, “do you want it?” is quite natural, but does literally just mean, “I don’t like it”, in the sense of, “no, (I already know) I don’t (usually/habitually) enjoy it (when I have it), (therefore I don’t want it)”. Even, “no, I don’t want it“ in response to, “do you like it?” is fairly natural in English, if a little presumptive-sounding.)
In Polish, when someone asks you “Czy chcesz cukru do kawy?” (“Do you want sugar in your coffee?”), you can respond with “Dziękuję”, which can mean two opposite things: “Yes, please” or “No, thank you”.
The first one gets a pass because it was the first one, and even then, I think it’s better to link people to one of the many explainers people have written about it.
I’m a layperson on this topic, but here’s my best understanding of what’s going on.
Take the function unique :: List[Int] -> Bool. There’s an obvious O(n) time algorithm: iterate through the list, throw everything into a hash table, return false if a number appears twice. This also requires (auxiliary) O(n) space, with the worst case being “the only duplicate is in the last element”.
There’s another algorithm you could do: sort the list and then compare adjacent elements. This requires O(n log n) time but only O(1) auxiliary space (with an in-place sort), meaning the “unique” problem has a more space-efficient solution than its most time-efficient one.
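For concreteness, the two algorithms described above look roughly like this (Python; note the O(1)-space claim assumes an in-place sort such as heapsort, which Python’s builtin sort is not):

def unique_hashset(xs):
    # O(n) time, O(n) auxiliary space: bail out on the first repeat.
    seen = set()
    for x in xs:
        if x in seen:
            return False
        seen.add(x)
    return True

def unique_sorted(xs):
    # O(n log n) time; O(1) auxiliary space if the sort is in-place.
    xs = sorted(xs)   # sorted() copies, so this particular line is O(n) space in practice
    return all(a != b for a, b in zip(xs, xs[1:]))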
(And here’s where I think I’m already wrong: the two solutions have different auxiliary space usage, but they both have the same overall space complexity O(n), as you already need that much space to store the whole input! I just don’t know any good examples of total space being more efficient, having thought about this for all of ten minutes.)
It’s been known for a long time that this is true in the general case: any problem’s time complexity is at least as “large” as its space complexity. The best bound, from 1977, said space couldn’t be that much more efficient: O(f(n)) time can be solved in O(f(n)/log(f(n))) space. So, for example, O(n²) time can be solved in O(n²/log(n)) space, which grows almost as fast as n² anyway.
This paper claims that O(f(n)) time can be solved in O(sqrt(f(n) log f(n))) space, so O(n²) time can be solved in O(n·sqrt(log n)) space, which is an enormous improvement. The blog says that if the paper passes peer review, it’ll be a major step towards resolving P ≠ PSPACE!
Anyone who’s an actual complexity theorist should please weigh in, I have no idea how much of my understanding is garbage
A lazy space-efficient solution for unique is just O(n²): compare each element to every other element.
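That one is short enough to spell out; the two loop indices are exactly the kind of logarithmic-size pointers into a read-only input mentioned below.

def unique_bruteforce(xs):
    # O(n^2) time, O(1) auxiliary space: just two indices into the (read-only) input.
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            if xs[i] == xs[j]:
                return False
    return True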
You might be able to get away with the sort if you have some sub-linear auxiliary structure (permutation index?) that lets you un-sort the list when done (if I understand catalytic computing right, which I likely don’t). Edit: never mind, log(n!) to even count the permutations is going to have n outside the log anyway.
Iirc from the lectures I attended a couple months ago, when it comes to log-space or at least less than linear, we treat the input tape as read only, the output tape as write (and maybe also append) only. The Auxiliary space is all you measure, but that readonly-ness is what breaks your sorting solution, because sorting requires mutation or copying. (I’m sure there are other ways of formalising it)
The simple log-space solution would be the brute force compare that @hcs mentioned, I think you would use one pointer/index to each of the two elements you are currently comparing which would take log(n) each.
I mentioned an output tape as the third tape because I was remembering log-space reductions (transforming the input from one problem to another using only log auxiliary space).
The auxiliary space is read+write. So for sublinear space machines you have read only input, sublinear auxiliary space, and the machine either accepts, rejects, or diverges.
In the log space reduction/transducer case I was remembering, you also then output a transformed result into an write only output tape, the key being you can’t store data there (and get around the aux space restriction), only append more output.
Another interesting thing (also pointed out on the wiki linked below) is that you can’t do an O(n + log n) read-write input tape, because linear working space can be recovered using the linear speedup theorem (I always think of it as SIMD for Turing machines: you can transform four input binary bits into one hex symbol, for example, and make your input 4x smaller at the cost of a bigger state machine).
Some other bits:
Formally, the Turing machine has two tapes, one of which encodes the input and can only be read, whereas the other tape has logarithmic size but can be written as well as read. Logarithmic space is sufficient to hold a constant number of pointers into the input and a logarithmic number of Boolean flags, and many basic logspace algorithms use the memory in this way.
Radix sort is O(n) if the elements of the array have a limited size. More precisely it’s O(nk) where k is the element size, which for integer elements is k = log m where m is the integer value of the largest element.
Edit: and the question of the existence of duplicate elements is only interesting if m > n (otherwise it’s trivially true) so in this context radix sort can’t do better than O(n log n).
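A sketch of what that O(nk) algorithm looks like for non-negative integers (Python, processing one base-256 digit per pass):

def radix_sort(xs):
    # LSD radix sort: k passes of a stable bucket sort, one per base-256 digit.
    out = list(xs)
    shift = 0
    while out and (max(out) >> shift):
        buckets = [[] for _ in range(256)]
        for x in out:
            buckets[(x >> shift) & 0xFF].append(x)
        out = [x for b in buckets for x in b]
        shift += 8
    return out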
It seems that today, with all of our tests, we allow the code to grow woolly and complicated in ways that, in the past, with no automated regression tests, we never could have tolerated. We were forced to keep the code simple because if we didn’t, the complexity of the special cases wouldn’t fit in our heads.
We were also forced to keep code simple because the computers of the past had much less memory and disk space than today. My first computer came with 16K of RAM (16,384 bytes). That’s not much for a program of any complexity. Even 64K (65,536 bytes) isn’t that much in the grand scheme of things (if, on average, an instruction takes 2 bytes, then a program can only ever have 32,767 instructions, with no space left over for any data—the editor I use, not large by today’s standards, has over 116,000 instructions and takes over 350,000 bytes just for the instructions).
Also, computers were a lot slower back in the day. I recall the breathless announcements when computers hit 33MHz. If you think your tests are slow today, they would have been glacial back then.
All this to say, we were forced to keep the code simple because the computers at the time were simpler.
I don’t think the correlation is so clear-cut. When we had 16k of RAM, we had to do manual memory management, didn’t we? Isn’t being able to write list.append("∀βcd∃") simpler than having to implement a linked-list for a char[] (and figuring out how to represent non-ASCII characters)?
True, some things are easier now; spell checking used to be a major engineering feat and now it’s just a few lines. Supporting non-ASCII is either easy (because of UTF-8) or insanely complex depending on what we want to do with said characters. It used to be that one could use a simple text editor [1] for programming, and now we expect way more from our programming environments than what could be done in the past.
Just how much code resides behind the simple list.append() and network.connect() calls?
[1] My first real editor I used was 40k in size. It was programmable. It could handle (very slowly) files larger than memory. And it was version 1.0. Limitations? Sure—lines were limited to 255 characters, and had to end with CRLF. But I never did find a bug in it.
You must reverse-engineer the existing code, understanding not just what it does but why it does it.
I find one way to preserve intellectual control over code is to write detailed comments that explain the overall design, why it was chosen, its implications, etc. And this applies both in the large (overall system/module design) and in the small (a function implementation, etc.). These days it’s not uncommon for the source files that I write to be 50% comments. While this requires effort and discipline, the result is invaluable when you revisit the code after a couple of years’ break and have only the fuzziest recollection of what might be going on.
One trick I have here is forcing myself to write file-level comments. Function-level comments do not always help: they explain what the function does, sometimes tell you why it exists, but rarely discuss whether it should exist in the first place. When commenting any specific fragment of code, the mind-state naturally gravitates to describing the code, not the context.
At the file level, you don’t have any immediate code to attach to, so it’s usually easier to trick yourself into writing about the surrounding context and high-level purpose, rather than the low-level minutia.
Interesting. I do sometimes do this to provide module-level comments in Haddocks, but don’t routinely go to this level. I have recently also become a big fan of GHC-style notes as a way to have cross-referenceable pieces of documentation that aren’t a function- or file-level comment.
I’ve also gravitated to writing lots of file-level comments without explicitly realizing that’s what I’ve been doing. You should write a blogpost about it!
I feel like a lot of Meyer’s stuff follows a pattern: he denigrates other people’s approaches to problems because they’re not his approach, and his approach is best by the virtue of being his approach. One example from this paper:
Consider first Gilles Kahn’s classic “Natural Semantics” paper [19]. It specifies a simple programming language and has a list of definitions for programming constructs, assuming a considerable mathematical baggage. The semantics of the conditional expression (“if … then … else …”), for example, is specified through two inference rules: [cut]
One can only shake one’s head: the notion of conditional expression can be explained in full to a 6-year-old: wherever E1 holds use E2 , otherwise use E3 . You may summon Fig. 4 for help.
Meyer mocks natural semantics without ever explaining the actual point of it: as a basis for tooling. Sure, conditional expression is intuitive to a 6-year-old, but I can’t force a 6-year-old inside my type checker!
This is especially ironic, because here’s Meyer’s later definition of the “conditional expression” if C then p else q:
if C then p else q end
->
if C: p, C’: q end
->
(C: p) ∪ (C': q)
->
<post_p / C, Pre_p ∩ C> ∪ <post_q / C', Pre_q ∩ C'>
->
<⌊post_p / C⌋ ∪ ⌊post_q / C'⌋ , (Pre_p ∩ C) ∪ (Pre_q ∩ C')>
Where C’ = “the complement of C” (not defined, but later indicated to be “¬C”), ⌊p⌋ = “rounded version of p” = ⟨post_p / Pre_p, Pre_p⟩, and r / X is the set {[x, y] ∈ r | x ∈ X}. I’m not fully clear on what it means to “round” a postcondition; he only defines “rounding” for a whole program.
Now compare this to what was so bad one could only “shake one’s head”:
but yeah, rounding is defined in the article only for programs, not for postconditions.
I’m skeptical about what this article contributes to the field (has this really not been done before?), but I could believe that building a theory of computation directly on set theory would be convenient for reusing set theory ideas and tools in computation theory.
Nice. I have a few folders somewhere called 2024/ and 2025/; inside are files 01.md etc. At the start of a new month I open up the right one and type :r !cal, and then each day I type <leader>t to insert today’s date before writing something down, so it’s very easy for me to organize things under headings like:
# 20250224
todo call that guy
todo compliance training
I open this file by adding a line to my bashrc: alias jn='vim ~/Documents/$(date +"%Y/%m").md' so I don’t have to remember the month or year.
I use these files to take notes which I never look at.
Very much agree, but I’d say that this current AI cycle is specifically GenAI, to distinguish it from other AI hypes that I suspect will happen in the future.
the events are almost completely unstructured, i.e. you can’t add arbitrary content without defining a format on top of it
requires lots of manual maintenance
adding an event spanning multiple days requires adding it to every single day
same for recurring events
events don’t have an id so deleting/modifying a previously added one requires finding all the other instances of the same event over multiple days/weeks/months by guesstimating if something is the same event or not
the week in every line seems of questionable utility
not a standard format so no existing email/calendar software can use it
many new and exciting ways to fuck up your schedule with a single typo that no other calendar software has ever provided
some of these can be addressed with tooling, but then wouldn’t you just rather use something without all the limitations?
Grepability, maybe? With one day per line, searching for an event always gets the date of the event, too. (You could copy the event date for each line, but maybe that has some other problem?)
It’s a chicken-and-egg problem now. Who would want to use a QR code encoding that won’t work with the majority of QR code readers for only a very small gain, and how many reader implementations are actively maintained and will add support for something nobody uses yet?

The “byte” encoding works well for that. Don’t forget, URLs can contain the full range of Unicode, so a restricted set is never going to please everyone. Given that QR codes can contain ~4k symbols, I’m not sure there’s much need for an entirely new encoding.
Yes, although… there’s some benefit to making QR codes smaller even when nowhere near the limits. Smaller ones scan faster and more reliably. I find it difficult to get a 1kB QR code to scan at all with a phone.
QR codes also let you configure the amount of error correction. For small amounts of data, you often turn up the error correction which makes them possible to scan with a very poor image, so they can often scan while the camera is still trying to focus.
URIs can contain the full Unicode range, but the average URL does not. Given that URLs have become a major use case for QR codes, it’s definitely a shame there isn’t a better mode for them: binary mode needs a byte per character while alphanumeric mode only needs 5.5 bits.
All non-ASCII characters in URLs can be %-encoded, so Unicode isn’t a problem. A 6-bit code has room for lower case, digits, and all the URL punctuation, plus room to spare for space, an upper-case shift, and a few more.
So ideally a QR encoder should ask the user whether the text is a URL, and should check which encoding is the smallest (alnum with %-encoding, or binary mode). Of course this assumes that any paths in the URL are also case-insensitive (which depends on the server).
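To make the size difference concrete, here is the arithmetic for a hypothetical URL that has already been uppercased and restricted to the 45-character alphanumeric set (Python; anything outside that set, including lowercase letters, would first have to be %-encoded, and the %XX triplets themselves stay within the set):

URL = "HTTPS://EXAMPLE.COM/ABC123"    # hypothetical example, 26 characters
pairs, odd = divmod(len(URL), 2)
alnum_bits = pairs * 11 + odd * 6      # alphanumeric mode: 2 chars -> 11 bits, lone char -> 6 bits
byte_bits = 8 * len(URL)               # byte mode: 8 bits per character
print(alnum_bits, byte_bits)           # 143 vs 208 bits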
Btw. can all HTTP servers correctly handle requests with uppercase domain names? I’m thinking about SNI, maybe certificate entries…? Or do browsers already “normalize” the domain name part of a URL into lowercase?
The spec says host names are case-insensitive. In practice I believe all (?) browsers normalize to lowercase so I’m not sure if all servers would handle it correctly but a lot certainly do. I just checked and curl does not normalize, so it would be easy to test a particular server that way.
Host name yes, but not path. So if you are making a URL that includes a path the path should be upper case (or case insensitive but encoded as upper case for the QR code).
No, a URI consists of ASCII characters only. A particular URI scheme may define how non-ASCII characters are encoded as ASCII, e.g. via percent-encoding their UTF-8 bytes.
Byte encoding is fun in practice, as I recently discovered, because Android’s default QR code API returns the result as a Java String. But aren’t Java Strings UTF-16? Why yes, and so the byte data is interpreted as UTF-8, then converted to UTF-16, and then provided as a string.

The workaround, apparently, if you want raw byte data, is to use an undocumented setting to tell the API that the data is in an 8-bit code page that can be safely round-tripped through Unicode, and then extract the data from the string by exporting it in that encoding.
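The same round-tripping trick, sketched in Python rather than the Android API (ISO-8859-1 maps each byte 0-255 to the code point with the same value, so bytes survive a trip through a text string unchanged, which UTF-8 decoding would not guarantee):

raw = bytes(range(256))                 # stand-in for an arbitrary binary QR payload
text = raw.decode("latin-1")            # every byte maps 1:1 to a code point
assert text.encode("latin-1") == raw    # lossless round trip
# raw.decode("utf-8") would instead raise UnicodeDecodeError on most binary data.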
I read somewhere that the EU’s COVID passes used text mode with base45 encoding because of this tendency for byte mode QR codes to be interpreted as UTF-8.
Welcome to lobsters! Just a heads up, we have a community norm here that posters should limit self promotion (submitting things they worked on or wrote) to no more than a quarter of posts and comments. Are there any interesting Ada articles you’ve run into that are worth submitting?
No it’s fine! It’s just that a lot of people who come in don’t realize that lobsters has a much stronger norm about this than other places. One post isn’t a problem, two isn’t, five posts is. So we try to let people know this early so they’re aware.
I like Alloy, especially after they added the new temporal features in version 6.
Last time I played with it, it had to be used from inside some awful IDE though. I know this is a common complaint, my point is not to repeat it. What I wanted to say is that while I was browsing the download page for Alloy just now, I noticed that it links to Sterling: A Web-based Visualizer for Relational Modeling Languages. I was hoping it would allow me to use my own editor and do custom visualisations in a browser, this doesn’t seem to be the case, rather it’s a new web-based IDE.
However, as I was skimming through the Sterling paper, I noticed that it said that Sterling also “is the visualizer for an Alloy-like model finder called Forge, which is being developed at Brown University and is used to teach a Logic for Systems class of over 60 students”. So I looked up Forge, it appears to be a Racket “language”/eDSL which looks like Alloy. I’m still not sure what the exact difference is, their readme says “Forge is heavily adapted from the excellent Alloy, a more widely used and somewhat more scalable tool. Forge and Alloy even use the same engines!”, but also “For broader historical context on Forge, the Alloy documentation or an Alloy demo may be useful. We don’t suggest using these as reference for Forge itself, however.”. In either case, it seems Forge might be a way to get a taste of Alloy without the sour taste of their IDE!
Hi, I’m one of the authors of Alloy 6 and of this new book.
I think the image of the Alloy Analyzer you’re depicting is a bit too dark. The IDE is quite sophisticated, since it features an interconnection of a solving engine, a rich visualizer, as well as a very useful evaluator (to navigate instances and counterexamples). Compared with other formal methods, I think Alloy is quite good in terms of user-friendliness and the usefulness of the results it provides. The legacy integrated editor is, on the other hand, quite basic but, considering models in Alloy are rather small, the -real- annoyances remain limited in my view (the editor has syntax highlighting, error reporting, find&replace…). An interesting aspect for non-power users is that this is all delivered in a single, ready-to-run JAR file. This is in particular invaluable with students (Alloy is taught in several universities, at least in Europe and in America).
A future version will allow users to rely on LSP and use an editor of their choice (there were already some experiments with LSP, and there actually is an unofficial VSCode plugin, but it’s not up to date). As an “expert” user myself, what I would not want, however, would be to lose the above-mentioned sophisticated interactions present in the tool for the sake of getting a slicker editing experience.
Alloy 6.2 now has a command line, but it’s not that great for solving; it’s more for automated build chains. Many serious Alloy users I know use the VSCode extension.
The new A6.2 command line indeed allows executing the same kinds of commands as in the GUI, with roughly the same options, so it’s mainly aimed at facilitating batch verification and benchmarks, I’d say.

For people who do not know the Alloy Analyzer completely: this is different from the Evaluator feature mentioned above, which lets you query, in Alloy itself, an instance or counterexample currently being shown by the Visualizer. This feature, which isn’t new by the way, is very useful in practice when models become complex.
Finally, the VSCode extension uses neither Alloy Analyzer 6.1 nor 6.2; it ships a tailored version, AFAIK. It’s indeed already great but, hopefully, in the future we will be able to get a plugin that can connect to the “real”, official Alloy.
A good “falsehoods” list needs to include specific examples of every falsehood.
Yours doesn’t! And I maintain that it’s still a good boy.

That doesn’t look to me like it’s meant to be an example of a good falsehoods list.
My point stands. Dogs.
In addition, it’s worth knowing that dogs up to 2 years of age exhibit the halting problem.
I’ll make an attempt, with the caveat that this list seems so obvious to me that I’m worried I might be missing some nuance (imagine a similar list about cooking utensils with “people think knives can only be used for butter, but in reality they can also be used to cut bread, meat, and even vegetables!!!”).
Both the substitutions and the surrounding text can depend on each other. The obvious example is languages where nouns have gender, but you might also have cases like Japanese where “in” might be へ, で, or に to indicate relative precision of the location.
German is the classic example of using lengthy compound words where English would use a shorter single-purpose word, “Rindfleisch” vs “beef” or “Lebensmittel” vs “food” (why yes I haven’t had lunch yet, why do you ask…?).
See above – English -> German tends to become longer, English -> Chinese tends to become shorter.
Turkish and German are famous counter-examples, with Turkish
'i'/'I'being different letters, or German ß capitalizing to"SS"(though I think this is now considered somewhat old-fashioned?).Not true in Chinese, Japanese, Korean.
Every language has words that don’t exist in any other language. Sometimes because the concept is alien (English has no native word for 寿司), sometimes because a general concept has been subdivided in a different way (English has many words for overcast misty weather that don’t translate easily into languages from drier climates).
I’m not sure what this means because many expressions in English don’t even have a single explanation in English, but in any case, idioms and double entendres often can’t be translated directly.
If one’s English to SVO order is limited, limited too must their knowledge of literature be.
Even English doesn’t follow a rule of capitalizing the first character of every word. Title Casing The First Letter Of Every Word Is Bad Style.
One well-known counter-example being languages where agreement is by repeating a verb:
A: “Do you want to eat lunch together?” B: “Eat.”
See above.
Color / colour, aluminum / aluminium
Not sure exactly what this means – upper-case vs lower-case? Latin vs Cyrillic? 漢字 vs ひらがな カタカナ ? 简化字 vs 繁体字 ? Lots of counter-examples to choose from, Kazakh probably being a good one.
Lithuanian sorts
'y'between'i'and'j': https://stackoverflow.com/questions/14458314/letter-y-comes-after-i-when-sorting-alphabeticallySome languages special-case ordering of letter combinations, such as
ijin Dutch.And then there’s the dozens of European languages that have their own letters outside the standard 26. Or diacritics.
Arabic, Hebrew.
Not sure what “flows” means here, but applications with good RtL support usually flip the entire UI – for example a navigational menu that’s on the right in English would be on the left in Arabic.
Chinese, Japanese.
English:
"Dear Mr. Smith".French famously has rules that differ from English regarding spacing around punctuation.
“ ” in English,「 」in Japanese, « » in French,
European languages that use
'.'for thousands separator and','for the fractional separator, or languages that group by different sizes (like lakh/crore in Indian languages).Many languages are considered distinct for political reasons, even if a purely linguistic analysis would consider them the same language.
English (as spoken in Pittsburgh), English (as spoken in Melbourne), and English (as spoken in Glasgow).
More seriously, Japanese and Javanese.
Often they’re difficult to understand even for English speakers (I once saw a literal hamburger used to signify a collapsable sidebar).
Nobody who has ever travelled would think this. And yet. AND YET!
C’mon Google, I know that my IP is an airport in Warsaw but I really don’t want the Maps UI to switch to Polish when I’m trying to find a route to my hotel.
You can roughly gauge where you are in the world by whether the local ATMs offer “🇬🇧 English”, “🇺🇸 English”, or “🇦🇺 English”.
Belgium, Luxembourg, Switzerland.
English, again.
The German ß has history.
The old rule is that ß simply has no uppercase. Capitalizing it as “SS” was the default fallback rule if you had to absolutely capitalize everything and the ß would look bad (such as writing “STRAßE” => “STRASSE”). Using “SZ” was also allowed in some cases.
The new rule is to use the uppercase ß: ẞ. So instead of “STRASSE” you now write “STRAẞE”.
The usage of “SZ” was disallowed in 2006, the East Germans had an uppercase ß since 1957, the West German rules basically said “Uppercase ß is in development” and that was doppred in 1984 for the rule to use SS or SZ as uppercase variant. The new uppercase ß is in the rules since 2017. And since 2024 the uppercase ß is now preferred over SS.
The ISO DIN 5008 was updated in 2020,
This means depending on what document you’re processing, based on when it was created and WHERE it was created, it’s writing of the uppercase ß may be radically different.
It should also be noted that if you’re in Switzerland, ß is not used at all, here the SS substitute is used even in lower case.
Family names may also have custom capitalization rules, where ß can be replaced by SS, SZ, ẞ or even HS, so “Großman” can become “GROHSMANN”. Note that this depends on the person, while Brother Großmann may write “GROHSMANN”, Sister Großmann may write “GROSSMANN” and their mother may use “GROẞMANN” and these are all valid and equivalent.
Umlauts may also be uppercased without the diacritic umlaut and with an E suffix; ä becomes “AE”. In some cases even lowercase input does the translation because older systems can’t handle special characters, though this is not GDPR compliant.
If you ever want to have fun, the politics and regionality of German dialects could be enough to drive some linguists up the wall.
Bavarian is recognized as a language and dialect at the same time, it can be subdivided into dozens and dozens of subdialects, which are all similar but may struggle to understand eachother.
As someone who grew up in Swabian Bavaria, my dialect is a mix of both Swabian and Bavarian, I struggle to understand Northern Bavaria but I struggle much less with Basel Swiss Germany (which is distinct from Swiss German in that it originates from Lower Allemans instead of Higher Allemans) which is quite close in a lot of ways.
And the swiss then double down on making things confusing by sometimes using french language constructs in german words, or straight up importing french or italian words.
What should I read to learn more about this? Why wasn’t the character in Unicode 1.0, then?
East Germany added the uppercase ß in 1957 and removed it in 1984. The spelling rules weren’t updated, so despite the presence of an uppercase ß, it would have been wrong to use it in any circumstances. Since Unicode 1.0 is somewhere around 1992, with some early drafts in 1988, it basically missed the uppercase ß being in the dictionary.
The uppercase ß itself has been around since 1905 and we’ve tried to get it into Unicode since roughly 2004.
Is this more like there being an attested occurrence in a particular dictionary in East Germany in 1957 rather than common usage in East Germany?
A good example of this is a CMS I used to work on. The way it implemented translation was to define everything using English[0], then write translations as a mapping from those English snippets to the intended language. This is fundamentally flawed, e.g. by homonyms:
Here, the “Read” flag means “this has been read”, whilst the “Read” button means “I want to read this”. Using the English as a key forces the same translation on both.
[0] We used British English, except for the word “color”; since we felt it was better to match the CSS keywords (e.g. when defining themes, etc.).
One trick is to use a different word on the asset: Reviewed(adj) and Review(v) don’t have the same problem that Read(adj) and Read(v) do. Seen(adj) and See(v); Viewed(adj) and View(v). And so on. Then you can “translate” to English to actually use Unread/Read/[Read] if you still like it without confusing the translator who need to know you want more like e.g. Lido/Ler or 阅读/显示 and so on.
Much better than the original article. Also love how many of the counter examples come from English.
My bar for these lists is https://yourcalendricalfallacyis.com/ and most “falsehoods programmers believe” lists don’t meet it.
The number of exceptions caused by the Hebrew calendar makes me shed a tear of joy.
Here’s one falsehood they missed: the length of a year varies by at most one day. True in the Gregorian calendar, apparently true in the Islamic calendar, but not true in the Hebrew calendar: leap years are 30 days longer than regular years.
They sorta cover it on the “days” section, by way of mentioning that the Hebrew calendar has leap months.
They also miss Byzantine calendars, which are still used by many churches and are related to the Jewish Greek calendar from the Septuagint. It’s of course complicated by the fact that many churches & groups do not agree on what year was the start, so it’s complex to use (but still in fairly common liturgical use).
Wow, 30? I need to read more about this.
Here’s a fun (counter)example of (something like) this one from my heritage language:
(Context: the word for enjoy/like is the same in the language, so when translating to English, I choose whichever sounds most natural in each given example sentence.)
When someone says, “do you (enjoy/)like it?”, if you want to say “yes, I like it”, that’s fine, but if you want to say you don’t like it, you would say, “I don’t want it”; if you were to say, “I don’t like it” in that situation, it would mean, “I don’t want it”. The same reversal happens if they ask, “do you want it?”, and you want to respond in the negative.
So someone would say, “do you want a chocolate bar?”, and you’d say, “no, I don’t want it”, and that would mean, “no, (I already know) I don’t (usually/habitually) enjoy it (when I have it), (therefore I don’t want it)”, whereas, “no, I don’t enjoy it” would just straightforwardly mean, “I don’t want it”.
(You can also respond with just, “no!” instead of using a verb in the answer.)
This only happens in the habitual present form. Someone might ask, “do you like chocolate?” before they offer you some, and you can say, “no, I don’t want it”, but if they already gave you a chocolate bar to try, they may ask, “did you like it?” in the past tense, and you’d have to respond with, “I didn’t like it” instead of, “I didn’t want it”. And, “do you want chocolate?” would be met with, “no, I don’t like it”, but “did you want chocolate?” would be met with, “no, I didn’t want it”, and that second one would just mean what it straightforwardly sounds like in English.
(Strictly speaking, it doesn’t need to be a response to a question, I’m just putting it into a context to show that the verb used in the answer isn’t just a negative form of the same verb used in the question.)
(It’s hard to explain because if you were to translate this literalistically to English, it wouldn’t even be noticed, since saying, “no, I don’t like it” in response to, “do you want it?” is quite natural, but does literally just mean, “I don’t like it”, in the sense of, “no, (I already know) I don’t (usually/habitually) enjoy it (when I have it), (therefore I don’t want it)”. Even, “no, I don’t want it“ in response to, “do you like it?” is fairly natural in English, if a little presumptive-sounding.)
In Polish, when someone asks you “Czy chcesz cukru do kawy?” (“Do you want sugar in your coffee?”), you can respond with “Dziękuję”, which can mean two opposite things: “Yes, please” or “No, thank you”.
The original ones, like “…Names”, don’t; part of what I find fun about them is trying to think of counterexamples.
I think if you want them to be useful they need to include counterexamples. If it’s just a vent post then it’s fine to leave them.
The first one gets a pass because it was the first one, and even then, I think it’s better to link people one of the many explainers people wrote about it.
I’m a layperson on this topic, but here’s my best understanding of what’s going on.
Take the function unique :: List[Int] -> Bool. There’s an obvious O(n) time algorithm: iterate through the list, throw everything into a hash table, return false if a number appears twice. This also requires O(n) auxiliary space, with the worst case being “the only duplicate is in the last element”.
There’s another algorithm you could do: sort the list and then compare adjacent elements. This requires O(n log n) time but only O(1) auxiliary space (for the sort), meaning the “unique problem” has a solution that is more space-efficient than its most time-efficient one.
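A quick Python sketch of the two approaches, just to make the trade-off concrete (my own illustration of the argument, not from the paper or blog):

    def unique_hash(xs: list[int]) -> bool:
        # O(n) time, O(n) auxiliary space for the set.
        seen: set[int] = set()
        for x in xs:
            if x in seen:
                return False
            seen.add(x)
        return True

    def unique_sorted(xs: list[int]) -> bool:
        # O(n log n) time; O(1) auxiliary space only if the sort is in place
        # (Python's sorted() copies, so read this as a sketch of the idea).
        ys = sorted(xs)
        return all(ys[i] != ys[i + 1] for i in range(len(ys) - 1))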
(And here’s where I think I’m already wrong: the two solutions have different auxiliary space usage, but they both have the same overall space complexity, O(n), since you already need that much space to store the whole input! I just don’t know any good examples of total space being more efficient, having thought about this for all of ten minutes.)
It’s been known for a long time that this is true in the general case: anything solvable in a given amount of time is also solvable using at most roughly that much space. The best bound since 1977 was “not that much more efficiently”: O(f(n)) time can be solved in O(f(n)/log(f(n))) space. So, for example, O(n²) time can be solved in O(n²/log(n)) space, which grows almost as fast as n² anyway.
This paper claims that O(f(n)) time can be solved in O(sqrt(f(n)·log(f(n)))) space, so O(n²) time can be solved in O(n·sqrt(log(n))) space, which is an enormous improvement. The blog says that if the paper passes peer review, it’ll be a major step towards resolving P ≠ PSPACE!
Anyone who’s an actual complexity theorist should please weigh in, I have no idea how much of my understanding is garbage
A lazy space-efficient solution for unique is just O(n²): compare each element to every other element.
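Sketched out in the same vein: the only state is the two loop indices, and the input is never written to.

    def unique_bruteforce(xs: list[int]) -> bool:
        # O(n^2) time; auxiliary state is just two indices, i.e. O(log n) bits.
        for i in range(len(xs)):
            for j in range(i + 1, len(xs)):
                if xs[i] == xs[j]:
                    return False
        return True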
You might be able to get away with the sort if you have some sub-linear auxiliary structure (permutation index?) that lets you un-sort the list when done (if I understand catalytic computing right, which I likely don’t). Edit: never mind, log(n!) to even count the permutations is going to have n outside the log anyway.
Iirc from the lectures I attended a couple of months ago, when it comes to log-space, or at least less than linear, we treat the input tape as read-only and the output tape as write-only (maybe even append-only). The auxiliary space is all you measure, but that read-only-ness is what breaks your sorting solution, because sorting requires mutation or copying. (I’m sure there are other ways of formalising it.)
The simple log-space solution would be the brute-force compare that @hcs mentioned; I think you would use one pointer/index for each of the two elements you are currently comparing, which would take log(n) each.
Ah, that makes so much sense now. Thank you!
Edit: when you say “write only”, does that mean you can’t write some intermediate result to the aux space and then read it back later?
I mentioned an output tape as the third tape because I was remembering log-space reductions (transforming the input from one problem to another using only log auxiliary space).
The auxiliary space is read+write. So for sublinear space machines you have read only input, sublinear auxiliary space, and the machine either accepts, rejects, or diverges.
In the log space reduction/transducer case I was remembering, you also then output a transformed result onto a write-only output tape, the key being that you can’t store data there (and thereby get around the aux space restriction), only append more output.
Another interesting thing (also pointed out on the wiki linked below) is that you can’t do O(n + log n) with a read-write input tape, because linear working space can be recovered using the linear speedup theorem (I always think of it as SIMD for Turing machines: you can transform four input binary bits into one hex symbol, for example, and make your input 4x smaller at the cost of a bigger state machine).
Some other bits:
https://en.wikipedia.org/wiki/L_(complexity)
And the text book referenced by wikipedia and my lecturer: Sipser 1997 https://fuuu.be/polytech/INFOF408/Introduction-To-The-Theory-Of-Computation-Michael-Sipser.pdf The definition of log space machine is on page 321 (342 of the pdf).
I think they have the same time complexity as well, since integer sort is O(n) (with any lexicographic sort like radix sort).
Radix sort is O(n) if the elements of the array have a limited size. More precisely it’s O(nk) where k is the element size, which for integer elements is k = log m where m is the integer value of the largest element.
Edit: and the question of the existence of duplicate elements is only interesting if m > n (otherwise it’s trivially true) so in this context radix sort can’t do better than O(n log n).
2015-2020 had a minor “serverless” bubble.
We were also forced to keep code simple because the computers of the past had much less memory and disk space than today. My first computer came with 16K of RAM (16,384 bytes). That’s not much for a program of any complexity. Even 64K (65,536 bytes) isn’t that much in the grand scheme of things (if, on average, an instruction takes 2 bytes, then a program can only ever have 32,768 instructions, with no space left over for any data; the editor I use, not large by today’s standards, has over 116,000 instructions and takes over 350,000 bytes just for the instructions).
Also, computers were a lot slower back in the day. I recall the breathless announcements when computers hit 33MHz. If you think your tests are slow today, they would have been glacial back then.
All this to say, we were forced to keep the code simple because the computers at the time were simpler.
I don’t think the correlation is so clear-cut. When we had 16k of RAM, we had to do manual memory management, didn’t we? Isn’t being able to write list.append("∀βcd∃") simpler than having to implement a linked list for a char[] (and figuring out how to represent non-ASCII characters)?
True, some things are easier now; spell checking used to be a major engineering feat and now it’s just a few lines. Supporting non-ASCII is either easy (because of UTF-8) or insanely complex depending on what we want to do with said characters. It used to be that one could use a simple text editor [1] for programming, and now we expect way more from our programming environments than what could be done in the past.
Just how much code resides behind the simple list.append() and network.connect() calls?
[1] My first real editor was 40k in size. It was programmable. It could handle (very slowly) files larger than memory. And it was version 1.0. Limitations? Sure: lines were limited to 255 characters, and had to end with CRLF. But I never did find a bug in it.
Nice article.
I find one way to preserve intellectual control over code is to write detailed comments that explain the overall design, why it was chosen, implications, etc. And this applies both in the large (overall system/module design) and in the small (a function implementation, etc.). These days it’s not uncommon for the source files that I write to be 50% comments. While this requires effort and discipline, the result is invaluable when you revisit the code after a couple of years’ break and have only the fuzziest recollection of what might be going on.
One trick I have here is forcing myself to write file-level comments. Function-level comments do not always help: they explain what the function does, sometimes tell you why it exists, but rarely discuss whether it should exist in the first place. When commenting any specific fragment of code, the mind-state naturally gravitates to describing the code, not the context.
At the file level, you don’t have any immediate code to attach to, so it’s usually easier to trick yourself into writing about the surrounding context and high-level purpose, rather than the low-level minutiae.
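A made-up example of the kind of file-level comment I mean (the module and every detail in it are hypothetical):

    """payments/reconcile.py

    Why this module exists: payment events arrive out of order from two
    providers, and neither stream can be trusted on its own, so we merge
    both and resolve conflicts in favour of whichever provider actually
    moved money.

    Key decision: reconciliation runs as a nightly batch, not streaming,
    because provider B only exposes a daily export. If that ever changes,
    most of this file could be replaced by a join in the events service.
    """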
Interesting. I do sometimes do this to provide module-level comments in Haddocks, but don’t routinely go to this level. I have recently also become a big fan of GHC-style notes as a way to have cross-referenceable pieces of documentation that aren’t a function- or file-level comment.
In OpenZFS we call these “grand theory” comments.
I’ve also gravitated to writing lots of file-level comments without explicitly realizing that’s what I’ve been doing. You should write a blog post about it!
I feel like a lot of Meyer’s stuff follows a pattern: he denigrates other people’s approaches to problems because they’re not his approach, and his approach is best by virtue of being his approach. One example from this paper:
Meyer mocks natural semantics without ever explaining the actual point of it: as a basis for tooling. Sure, the conditional expression is intuitive to a 6-year-old, but I can’t force a 6-year-old inside my type checker!
This is especially ironic, because here’s Meyer’s later definition of the “conditional expression”
if C then p else q
where C’ = “the complement of C” (not defined, but later indicated to be “¬C”), ⌊p⌋ = “rounded version of p” = ⟨post_p / Pre_p, Pre_p⟩, and r / X is the set {[x, y] ∈ R | x ∈ X}. I’m not fully clear on what it means to “round” a postcondition; he only defines “rounding” a whole program.
Now compare this to what was so bad one could only “shake one’s head”:
Which is easier to understand? Which is easier to programmatically evaluate?
To be fair, we’ve had lots of training and years to grok the Kahn notation, and 12 hours to digest the Meyer one.
This is at least a very authentic Meyer paper, we can be pretty sure it’s not written by an LLM, which I find somewhat refreshing :)
Sounds like an interesting direction to me, rants about Hoare et al. notwithstanding, and cool to see Isabelle-proofs as part of the release.
I think
⌊post_p⌋ = post_⌊p⌋
Which I think is this?
post_p / (Pre_p ∩ C)
but yeah, rounding is defined in the article only for programs, not for postconditions.
I’m skeptical about what this article contributes to the field (has this really not been done before?), but I could believe that building a theory of computation directly on set theory would be convenient for reusing set theory ideas and tools in computation theory.
Nice. I have a few folders somewhere called 2024/ and 2025/; inside are files 01.md etc. At the start of a new month I open up the right one and type :r !cal, and then each day I type <leader>t to insert today’s date before writing something down, so it’s very easy for me to organize things under date headings. I open this file by adding a line to my bashrc:
alias jn='vim ~/Documents/$(date +"%Y/%m").md'
so I don’t have to remember the month or year.
I use these files to take notes which I never look at.
All notetaking software in a nutshell
Very much agree, but I’d say that this current AI cycle is specifically GenAI, to distinguish it from other AI hypes that I suspect will happen in the future.
And AI hype cycles that happened in the past, like 5th Gen programming in the late ’80s.
One day per line sucks if you wanna track more than 2 or 3 events per day.
There are some other obvious UX issues as well.
Some of these can be addressed with tooling, but then wouldn’t you just rather use something without all the limitations?
Tbh, I don’t see why that limit is introduced. I just add extra lines for each new event, and his whole workflow still works perfectly fine
Grepability, maybe? With one day per line, searching for an event always gets the date of the event, too. (You could copy the event date for each line, but maybe that has some other problem?)
A markdown-based format with an h2 for each day would be better imho (an h3 for each event).
If encoding of typical URLs doesn’t work really well, shouldn’t they make a new encoding version that is specialized for alphanumeric, / : ? & etc?
chicken-egg problem now. Who would want to use a QR code encoding that won’t work with the majority of QR code readers for only a very small gain, and how many reader implementations are actively maintained and will add support for something nobody uses yet?
The encoding could be invented for internal use by a very large company, kinda like UPS’s MaxiCode: https://en.m.wikipedia.org/wiki/MaxiCode
What you’re describing is the problem for all new standards. How do they ever work? ;-)
Better in environments that are not as fragmented and/or can provide backwards compatibility? ;)
The “byte” encoding works well for that. Don’t forget, URLs can contain the full range of Unicode, so a restricted set is never going to please everyone. Given that QR codes can contain ~4k symbols, I’m not sure there’s much need for an entirely new encoding.
Yes, although… there’s some benefit to making QR codes smaller even when nowhere near the limits. Smaller ones scan faster and more reliably. I find it difficult to get a 1kB QR code to scan at all with a phone.
QR codes also let you configure the amount of error correction. For small amounts of data, you often turn up the error correction which makes them possible to scan with a very poor image, so they can often scan while the camera is still trying to focus.
IME small QR codes scan very fast even with the FEC at minimum.
URIs can contain the full Unicode range, but the average URL does not. Given that URLs have become a major use case for QR codes, it’s definitely a shame there isn’t a better mode for them: binary mode needs 8 bits per character while alphanumeric only needs 5.5.
All non-ASCII characters in URLs can be %-encoded, so Unicode isn’t a problem. A 6-bit code has room for lowercase letters, digits, and all the URL punctuation, plus room to spare for space, an uppercase shift, and a few more.
So ideally a QR encoder should ask the user whether the text is a URL, and should check which encoding is the smallest (alnum with %-encoding, or binary mode). Of course this assumes that any paths in the URL are also case-insensitive (which depends on the server).
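A back-of-the-envelope sketch of that size check in Python (my own estimate: it ignores the QR mode and length headers, and assumes anything outside the alphanumeric set would be %-encoded first):

    # QR alphanumeric alphabet: 45 symbols, 11 bits per pair of characters.
    QR_ALNUM = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")

    def byte_mode_bits(s: str) -> int:
        return 8 * len(s.encode("utf-8"))

    def alnum_mode_bits(s: str) -> int | None:
        s = s.upper()
        if any(c not in QR_ALNUM for c in s):
            return None  # e.g. ? & = would need %-encoding first
        pairs, rest = divmod(len(s), 2)
        return 11 * pairs + 6 * rest  # 6 bits for a trailing odd character

    url = "HTTPS://EXAMPLE.COM/SOME/PATH"
    print(byte_mode_bits(url), alnum_mode_bits(url))  # 232 vs 160 bits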
Btw. can all HTTP servers correctly handle requests with uppercase domain names? I’m thinking about SNI, maybe certificate entries…? Or do browsers already “normalize” the domain name part of a URL into lowercase?
The spec says host names are case-insensitive. In practice I believe all (?) browsers normalize to lowercase so I’m not sure if all servers would handle it correctly but a lot certainly do. I just checked and curl does not normalize, so it would be easy to test a particular server that way.
Host name yes, but not the path. So if you are making a URL that includes a path, the path should be uppercase (or case-insensitive but encoded as uppercase for the QR code).
No, a URI consists of ASCII characters only. A particular URI scheme may define how non-ASCII characters are encoded as ASCII, e.g. via percent-encoding their UTF-8 bytes.
Ok? You see how that makes the binary/byte encoding even worse right?
Furthermore, the thing that’s like a URI but not limited to ASCII is a IRI (Internationalized Resource Identifier).
You’re right (oh and I should know that URLs can contain anything…. Let’s blame it on Sunday :-))
Byte encoding is fun in practice, as I recently discovered, because Android’s default QR code API returns the result as a Java String. But aren’t Java Strings UTF-16? Why yes, and so the byte data is interpreted as UTF-8, then converted to UTF-16, and then provided as a string.
The work around, apparently, if you want raw byte data, is to use an undocumented setting to tell the API that the data is in an 8-bit code page that can be safely round tripped through Unicode, and then extract the data from the string by exporting it in that encoding.
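The general round-trip trick, sketched in Python rather than against the Android API (whose exact setting I don’t know): decode with an 8-bit code page that maps every byte value to a distinct code point, then encode with the same code page to recover the bytes.

    raw = bytes(range(256))                  # arbitrary binary payload
    as_text = raw.decode("latin-1")          # each byte -> one code point
    recovered = as_text.encode("latin-1")    # lossless round trip
    assert recovered == raw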
I read somewhere that the EU’s COVID passes used text mode with base45 encoding because of this tendency for byte mode QR codes to be interpreted as UTF-8.
Do you really mean base45 or was that a typo for base64? I’m confused because base64 fits fine into utf8’s one-byte-per-character subset. :)
There’s an RFC: https://datatracker.ietf.org/doc/rfc9285/ and yes, base45, which is numbers and uppercase and symbols that matches the QR code “alphanumeric” set.
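For the curious, a minimal encoder sketch of how I read that RFC (not production code): every 2 bytes become 3 characters from the QR alphanumeric alphabet, and a trailing single byte becomes 2.

    ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"

    def b45encode(data: bytes) -> str:
        out = []
        for i in range(0, len(data) - 1, 2):
            n = data[i] * 256 + data[i + 1]   # 2 bytes, big-endian
            out += [ALPHABET[n % 45], ALPHABET[n // 45 % 45], ALPHABET[n // (45 * 45)]]
        if len(data) % 2:
            n = data[-1]                      # trailing single byte
            out += [ALPHABET[n % 45], ALPHABET[n // 45]]
        return "".join(out)

    print(b45encode(b"AB"))  # "BB8", if I'm reading the RFC's example right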
ahhh ty
Welcome to lobsters! Just a heads up, we have a community norm here that posters should limit self promotion (submitting things they worked on or wrote) to no more than a quarter of posts and comments. Are there any interesting Ada articles you’ve run into that are worth submitting?
EDIT: Ah I understand - I can submit another Ada post as well
No it’s fine! It’s just that a lot of people who come in don’t realize that lobsters has a much stronger norm about this than other places. One post isn’t a problem, two isn’t, five posts is. So we try to let people know this early so they’re aware.
Appreciate the advice 👍
I like Alloy, especially after they added the new temporal features in version 6.
Last time I played with it, it had to be used from inside some awful IDE though. I know this is a common complaint, my point is not to repeat it. What I wanted to say is that while I was browsing the download page for Alloy just now, I noticed that it links to Sterling: A Web-based Visualizer for Relational Modeling Languages. I was hoping it would allow me to use my own editor and do custom visualisations in a browser, but this doesn’t seem to be the case; rather, it’s a new web-based IDE.
However, as I was skimming through the Sterling paper, I noticed that it said that Sterling also “is the visualizer for an Alloy-like model finder called Forge, which is being developed at Brown University and is used to teach a Logic for Systems class of over 60 students”. So I looked up Forge, it appears to be a Racket “language”/eDSL which looks like Alloy. I’m still not sure what the exact difference is, their readme says “Forge is heavily adapted from the excellent Alloy, a more widely used and somewhat more scalable tool. Forge and Alloy even use the same engines!”, but also “For broader historical context on Forge, the Alloy documentation or an Alloy demo may be useful. We don’t suggest using these as reference for Forge itself, however.”. In either case, it seems Forge might be a way to get a taste of Alloy without the sour taste of their IDE!
Hi, I’m one of the authors of Alloy 6 and of this new book.
I think the image of the Alloy Analyzer you’re depicting is a bit too dark. The IDE is quite sophisticated since it features the interconnection of a solving engine, a rich visualizer, and a very useful evaluator (to navigate instances and counterexamples). Compared with other formal methods, I think Alloy is quite good in terms of user-friendliness and the usefulness of the results it provides. The legacy integrated editor is, on the other hand, quite basic, but considering models in Alloy are rather small, the real annoyances remain limited in my view (the editor has syntax highlighting, error reporting, find & replace…). An interesting aspect for non-power users is that this is all delivered in a single, ready-to-run JAR file. This is particularly invaluable with students (Alloy is taught in several universities, at least in Europe and in America).
A future version will allow users to rely on LSP and use an editor of their choice (there were already some experiments with LSP, and there actually is an unofficial VSCode plugin, but it’s not up to date). As an “expert” user myself, what I would not want, however, would be to lose the above-mentioned sophisticated interactions present in the tool for the sake of a slicker editing experience.
Alloy 6.2 now has a command line, but it’s not that great for solving, more for automated build chains. Many serious Alloy users I know use the VSCode extension.
The new A6.2 command line indeed allows executing the same kinds of commands as in the GUI, with roughly the same options, so it’s mainly aimed at facilitating batch verification and benchmarks, I’d say.
For people who do not know the Alloy Analyzer completely: this is different from the Evaluator feature mentioned above, which lets you query, in Alloy itself, an instance or counterexample currently being shown by the Visualizer. This feature, which isn’t new by the way, is very useful in practice when models become complex.
Finally, the VSCode extension does not use Alloy Analyzer 6.1 or 6.2 but ships a tailored version, AFAIK. It’s indeed already great, but hopefully in the future we will be able to get a plugin that can connect to the “real”, official Alloy.