I’m probably the perfect audience for nushell, since I do do a ton of personal ad-hoc scripting, I do hate the text-focused approach of typical shells and I (despite the syntax and verb-noun commands) really liked working with PowerShell when I used it.
The thing that makes me doubt stuff like nushell and similar is one big question: how?
PowerShell works (on Windows, I have not used it outside of Windows) because MS owns the whole platform. All the basic cmdlets are owned by them (plus they didn’t go for compat with bat), the libraries they depend on are owned by them, etc. Being able to call into arbitrary libraries really sold me on PowerShell, because it opened up a whole world of scriptable stuff that was not doable in bash or similar.
On Linux especially (the BSDs and other UNIXen could take this approach with the base system if they wanted to), nobody owns anything. That means that the example of ls | size > 10kb is either calling a builtin replacement for ls, which may work differently or have different, incompatible flags; or it’s parsing the output of ls to turn it into an object, which is scary from a portability perspective. The latter especially is worrying for me, since I use a mix of OpenBSD, FreeBSD and macOS (work) typically, and even if they have a shared heritage, flags and output across base utilities have diverged. Even if there’s a totally functioning, portable wrapper around/replacement for ls, where does that stop being the case? How does trying to use data from ifconfig work across platforms?
My hesitancy to try something like nushell is not because I think it’s a bad idea. I love it, I’d love to never think about parsing text in shell again and the associated pitfalls. But I can’t help but wonder where all of the niceties will inevitably fall apart, what escape hatches exist, and how well that’ll work. Honestly I should probably just try it and find out for myself.
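For concreteness, the shape of those escape hatches today looks roughly like this (nushell syntax from memory, so treat it as a sketch):

ls | where size > 10kb                          # builtin ls: a real table, filtered on its size column
^ls -l | lines                                  # ^ forces the external binary; now you're back to plain text
ifconfig | parse --regex 'inet (?P<addr>\S+)'   # hand-rolled parsing as the escape hatch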
FWIW I glossed over this the first time around, and it seems like many HN commenters did too, but once you add this constraint, it takes many languages off the table – Java, basically all Lisps, Julia, OCaml, etc.
Ability to use static dispatch, static field offsets and stack allocation.
Control over memory layout - at minimum being able to express arrays-of-structs without pointer soup.
Not sure about APL and Forth, since I don’t think they really have complex data (nested) structures, or at least the language ergonomics around them are pretty bad.
So reading between the lines a bit, I believe the author is asking for a REPL’d, interpreted, and fast running Go, Swift, Rust, C++, or Zig. (which to me is an interesting problem – I basically use shell as my REPL for C++)
OCaml with new value types would qualify, and Java with value types would qualify, but they’re not there yet AFAIK.
I think Julia/Clasp can get that kind of performance in some cases, but it’s not predictable performance, with language control.
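To make the “static dispatch / memory layout” constraints above concrete, here’s a rough Rust sketch (type and field names invented) of the arrays-of-structs shape being asked for:

// One contiguous allocation; each field lives at a fixed offset, and the
// method call is statically dispatched. No per-element heap pointers to chase.
struct Particle { x: f32, y: f32, vx: f32, vy: f32 }

fn step(particles: &mut [Particle], dt: f32) {
    for p in particles.iter_mut() {
        p.x += p.vx * dt;
        p.y += p.vy * dt;
    }
}
// The "pointer soup" alternative would be something like Vec<Box<dyn Entity>>,
// where every element is a separate allocation reached through a vtable.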
C# + LinqPad fits this bill. Not quite a REPL, but scripts run so quickly that it is near the same experience. I’ve used that for interactive coding quite a few times.
That does have the disadvantage of being Windows only, but it is worth a try.
It is possible to AOT-compile C#, but that’s certainly not happening in LinqPad as far as I’m aware. Java is also nearly always JIT-ed, outside of Android, I think.
I wonder, as far as handles go, if you can make smart handles that implement some sort of refcounting. It’d add overhead, of course, but it’d let you re-activate item slots with confidence.
Granted, that may well be a problem that doesn’t need to be addressed for most games.
If the upper bound of your handles is smaller than the size of integer you’re trying to fit them in, you can use the remaining bits for (sticky) ref counts.
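One way to read that (a rough sketch; constants and field names invented): keep the generation in the low bits of each slot’s metadata word, put a small saturating (“sticky”) refcount in the spare high bits, and only recycle a slot once the count drops back to zero:

const GEN_BITS: u32 = 24;
const GEN_MASK: u32 = (1 << GEN_BITS) - 1;
const COUNT_MAX: u32 = (1 << (32 - GEN_BITS)) - 1;

struct Slot { meta: u32 } // high 8 bits: sticky refcount, low 24 bits: generation

fn acquire(slot: &mut Slot) {
    if slot.meta >> GEN_BITS < COUNT_MAX {
        slot.meta += 1 << GEN_BITS; // saturate instead of wrapping ("sticky")
    }
}

fn release(slot: &mut Slot) -> bool {
    let count = slot.meta >> GEN_BITS;
    if count > 0 && count < COUNT_MAX {
        slot.meta -= 1 << GEN_BITS;
    }
    if slot.meta >> GEN_BITS == 0 {
        slot.meta = (slot.meta + 1) & GEN_MASK; // bump generation; slot can be reused
        return true;
    }
    false // either still referenced, or saturated and permanently retired
}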
I’m also looking at Elixir and Phoenix (coming from Python/C++), and it looks cool. Doing realtime features with Python requires something like Tornado / asyncio, which isn’t ideal IMO.
I’m all for the immutable / message-passing architecture in the large, but I wish that you could just write Python-style code within processes. The VM copies all data structures at process boundaries anyway.
I think that language would be very popular!
I wrote Lisp 20 years ago, before Python, but now it feels a little annoying to “invert” all my code, for what seems like no benefit
defmodule Recursion do
  def sum_list([head | tail], acc) do
    sum_list(tail, acc + head)
  end

  def sum_list([], acc) do
    acc
  end
end
I would rather just write
def sum_list(L):
    result = 0
    for item in L:
        result += item
    return result
But then also have the Erlang VM to schedule processes and pass immutable messages.
There is obviously a builtin in Python that does this, and I’m sure there is a better way to do this in Elixir. But I think the general principle does hold.
fucking hell it’s on the front page of hackernews? Which of you jerks is responsible for that?! My server, it’s melting, it’s meeeeeelting!
…nah it’s actually doing fine. Gotta admit that HN seems the ideal venue for an “X for cynical curmudgeons” write-up, though I’d probably use stronger words if I were being honest about it.
Definitely a weird experience. Nice way to stress-test one’s infrastructure though. Apparently gitit occasionally flakes out under moderate load for uncertain reasons and just gives you a mystery error with Happstack saying “something died, whoops”, which under normal circumstances I see like… once every 2 years and which is fixed by reloading the page. But, throw a quarter million page views at it in 12 hours and the problem seems to crop up enough that it’s actually noticeable.
gitit has served me well for a long long time, and is actually lightly maintained again which was not the case for a few years, but I keep wanting to switch to something a little less fiddly to build and maintain. (Though once it is built it basically runs forever until you turn it off. There used to be occasional memory leaks but they seem to have gone away.) Unfortunately, it seems like writing wiki software that is simple and also works well is no longer fashionable, so if you’re not running PHP you have like… 4 plausible choices, none of which I like. I keep low-key meaning to write my own wiki software, but the user management and CMS bullshit are not actually terribly fun to implement.
Also a great fucking example of why I never read HN. Half the comments there are “Elixir is just a thin wrapper over Erlang” when the entire fucking point of the post is me saying “I thought the exact same thing but it’s way more than that, let me tell you exactly how!”
I might be mistaken, but I believe the use of recursion is to help preemption. The BEAM will give you a certain number of function calls before switching over to another process. Using recursion instead of loops keeps the functions small so things don’t hang.
(And if I am mistaken, I’m relying on Cunningham to correct me here.)
Spot on – loops would introduce the ability for a single process to hog a CPU, because the Erlang scheduler uses function calls as potential stopping points for the execution of each process, and a loop can be free of function calls.
Using recursion also eases hot upgrades, as a function can hand off its state to a more recent version of itself, thereby upgrading mid-loop. This only happens with module:function() (Erlang) or Module.function() (Elixir) calling syntax; function()-style calls without a module specified will not take upgrades into account.
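A minimal sketch of that shape (module and messages invented); the fully-qualified tail call is what lets the process pick up a newly loaded version of the module on its next iteration:

defmodule Counter do
  def loop(count) do
    receive do
      {:add, n} ->
        Counter.loop(count + n)       # fully-qualified: jumps to the newest loaded Counter
      {:get, caller} ->
        send(caller, {:count, count})
        Counter.loop(count)
    end
  end
end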
Yeah I’ve heard that, and it makes sense that Elixir adheres to that constraint. I guess I’m thinking more if you were to design a concurrent language from scratch.
There’s always Go, but I’ve heard a lot of Go code doesn’t even use channels much, and as far as I know the data you pass over channels is mutable
Entirely up to you in both cases. Yes, you can get by with hardly using channels, especially if you’re writing a webapp, where your unit of concurrency is the request and requests effectively never interact with one another (except via the database or whatever). Yes, you can pass mutable things around using channels, but if you do then you have to think about ownership (effectively, the goroutine that receives a value via a channel is the only one that can safely even look at it until it relinquishes ownership by sending it to someone else or doing some other synchronizing thing like putting it in a sync.Map or storing it in a variable protected by a mutex) while if your data is immutable the question doesn’t even arise. So it’s definitely possible to learn that avoiding mutability, or at least limiting it to a few places, is good style.
Good question; where exactly DOES the scheduling happen? I thought the Go runtime was an example of a lang where the compiler inserted a “yield to scheduler” call every X loop iterations, but based on some searching of stackoverflow it appears I was mistaken. I tried to dig into the BEAM interpreter source code and it appears to have a yield instruction as well as yielding on specific operations like waiting for a message or exiting a process, but it also has a lot of templates generating the actual code to interpret instructions, so it’s hard to figure it out for sure.
If you control the compiler and VM, you can in principle interrupt lightweight threads whenever you feel like it. Just have a flag that gets checked after executing every instruction. It’s probably also possible to get the OS to help you out a little; something like make an OS thread that will sleep and then send a signal to the current process, and have the signal handler pause the VM in its tracks and change where it is currently executing. It’d suck, and suck 10x more with multithreaded execution, but you can do it. Hmmm.
Either way, Erlang and Elixir don’t even have loops, so. Recursion or bust!
Basically Erlang is made up of BIFs (Built-In Functions), which are written in C. Each BIF call is a yielding point. BIFs can also yield mid-flight in some cases. Each BIF has a “cost” counted in “reductions”. This is quite abstract; the cost is not directly linked to wall time.
Each process is given a reduction budget by its scheduler. Every BIF call checks whether the budget is used up. If not, it subtracts its reduction cost, runs, and moves on to the next BIF.
Otherwise, it schedules itself out and tells the scheduler about it.
This makes it look preemptive, because from the POV of the user code it is. But at the runtime level it is cooperative at every basic op.
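You can actually watch the reduction counter from iex; a rough sketch (the numbers will vary from run to run):

{:reductions, before} = Process.info(self(), :reductions)
Enum.each(1..100_000, fn _ -> :ok end)
{:reductions, later} = Process.info(self(), :reductions)
IO.puts("that cost roughly #{later - before} reductions")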
Good question; where exactly DOES the scheduling happen?
I’m a little hazy on the details, but the operative word to investigate is reductions. I think the term is borrowed from Prolog and represents a function invocation.
If you control the compiler and VM, you can in principle interrupt lightweight threads whenever you feel like it. Just have a flag that gets checked after executing every instruction.
I think it’s a bit simpler and more performant if you can check every time a function ends. Since you don’t have mutable, global state you just need to hold onto the next function and its arguments. Then you invoke them the next time the process needs to be scheduled. Though I don’t know how that works with stack traces.
Either way, Erlang and Elixir don’t even have loops, so. Recursion or bust!
Yeah, and if I’m remembering correctly that’s not just a philosophical choice, but useful for the implementation.
Again, I’m really talking at the edge of my knowledge here so please smite me if I’m wrong, gods of the Internet.
def sum_list(list), do: Enum.reduce(list, 0, fn el, acc -> acc + el end)
(there are other ways too, depending on how kinky you’re feeling)
I’m not quite sure what you mean by “invert all your code”.
EDIT:
Given how much Python seems to hate functional programming and idioms, there is definitely an impedance mismatch when translating code over. That said, with many years of experience behind me, I can assure you that it’s quite easy to write hacky procedural code in Elixir if you want to.
Yeah I should have picked a different example – the idea I was trying to get across was that mutability within a loop is useful.
Here’s a similar question I asked with Python vs. OCaml and Lisp. Short-circuiting loops seems to cause the code to become fairly ugly in functional languages
You can do it pretty nicely with reduce_while (or plain recursion if preferred):
list = ~w[foo bar baz] # ~w is sigil for list of words

Enum.reduce_while(list, [], fn elem, acc ->
  if String.starts_with?(elem, "b") do
    {:halt, [String.upcase(elem) | acc]}
  else
    {:cont, acc}
  end
end)
I definitely have had times where I’ve struggled a bit with the immutability of the BEAM when writing elixir, usually when modifying big bags of state.
It’s the needing to reach into structures and needing to copy modified structures that has given me the most grief so far. And there is stuff like Kernel.update_in and co, but yeah
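For the “big bag of state” case, update_in/put_in with a key path do cut down on the hand-copying; a small sketch (data shape invented):

state = %{players: %{"ana" => %{score: 10, items: []}}}

state
|> update_in([:players, "ana", :score], &(&1 + 5))
|> put_in([:players, "ana", :items], [:sword])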
I have found that Elixir can work for little scripting things quite nicely, in part due to being able to put multiple modules in a single file. There are still some edges, especially if you spin up a server inside said script.
I have yet to do arg parsing in such a thing, though.
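(For whenever that comes up: OptionParser in the standard library covers the basics; an untested sketch for a .exs script:)

{opts, args, _invalid} =
  OptionParser.parse(System.argv(), strict: [verbose: :boolean, name: :string])

IO.inspect(opts) # e.g. [verbose: true, name: "demo"] for: elixir script.exs --verbose --name demo
IO.inspect(args) # whatever positional arguments are left over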
This is the current behavior, and it’s easy to grok, but I agree it doesn’t scale well. I’m still thinking through how I want to model streaming for larger or longer-lived processes. See also this thread
The most liberating thing for me was to start thinking about the RDF-compatible XML subset (that is, the data model of XML and why namespaces are helpful in a standardized protocol) and forget about the syntax.
Any XML with zero xmlns or being used ad-hoc with no spec is of course super cringe.
Sure, of course. The namespace is part of the name and you need to specify what you are actually selecting so you don’t select things from namespaces you don’t know about. That’s kind of the point
It’s not just needing to specify the name, it’s that every time you’re setting up an XPath query using the XmlDocument class, you have to add a nametable.
The first answer here is an example of the fiddliness in .NET. It’s a lot when you compare XPath to things like jq or Regex, and the failure mode if you don’t do that, or don’t know to do that, is that your XPath query silently doesn’t return any results.
Like, I get why xmlns exists, but adding all of this ceremony to interacting with it adds to the Enterprise Bulk reputation that XML has.
You would rather the query syntax have used something like //{urn:the:namespace}someelement rather than using prefixes and having you specify a table? I’m sympathetic to that for sure, though I think it probably complicated the query syntax when many of the elements happen to be in the same namespace.
But the thing that really irks me is that Douglas made some very vague assertions in the video, which is fine for click-bait, but the author of this article is making a ton of assumptions about what those “bad foundations” are.
I am not a Douglas Crockford stan.. but a good place to start if you want to see the world from his perspective is his programming language “E”.
Another thing to pick apart, in the video he said “it may be the best language right now to do what it does, but we should do better.” That would imply we’re including all the modern languages in there.
Now for my take:
When we’re talking about application development, raw performance is the last characteristic that is interesting to me.
And the things that are holding us back from writing massive, understandable applications aren’t just stronger core data structures and destructors; these are baby steps. We need to go beyond programming based on procedures and data structures.
raw performance is the last characteristic that is interesting to me.
Using a lang like Rust means I never have to encounter a case where the language does not meet requirements due to performance (unless I need to drop down to hand-optimized assembly or something, and I just don’t think I’ll ever really need that). Even though I don’t “need” the performance most of the time, it is nice knowing that it’s there if a problem comes up. With Ruby, if you hit an issue it’s “write a C extension”, but then I write Ruby because I don’t want to write C.
The other thing I think about is expressivity. If a language does not have semantics that allow me to express my low level performance requirements to the machine (so it can be optimized), what other kinds of things are hard to express?
We spent decades trying to decouple logic “compute” from memory only to come full circle as it turns out the two are deeply intertwined.
I don’t organize my tech choices by the 1% bottleneck in my system; I guess the difference between us is I don’t mind writing a C/Zig/Rust extension if 99% of my code can stay in ruby. I think we could find new ways to solve the 2-language problem, but I don’t believe it’s as simple to solve as people think. You cannot build rails in rust, and rails is still more boilerplate & restrictive than I’d prefer; the system I want doesn’t exist yet, but I know it wouldn’t be able to be written in rust.
I guess what I was trying to express is that the two language problem isn’t about performance but rather expression (for me).
I love that I have the capability to express to my computer “I want exclusive access to this input” (via a mutable reference) or that I can express “this function does not mutate its input” or “this function should not allocate”.
I am a ruby core contributor and maintain syntax suggest. After about a year of writing rust code I ran into an internal mutation bug in syntax suggest (written in Ruby) that cost me a few hours of my life. In Rust that kind of logic bug would have been impossible (preventing it is the default), because the code wouldn’t have even compiled. Yes, that is the same limitation that allows for not needing a GC, but it also has benefits beyond performance.
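For a flavor of what gets rejected, here’s a tiny sketch of that class of bug (mutating a collection while something else still holds a view into it), which rustc refuses to compile:

fn main() {
    let mut words = vec!["fn", "def", "end"];
    let first = &words[0];   // shared borrow into the Vec
    words.push("do");        // error[E0502]: cannot borrow `words` as mutable...
    println!("{first}");     // ...because `first` is still alive here
}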
I’m not advocating everyone using it for everything, but I’m saying you cannot avoid thinking about memory (in GC languages). It’s just a question of how much you have to think about it and when.
I’m not advocating everyone using it for everything, but I’m saying you cannot avoid thinking about memory (in GC languages). It’s just a question of how much you have to think about it and when.
There’s an inescapable tradeoff in languages that are “smart” at some level where most of the time it figures out the optimizations on its own and it’s fine, but every once and a while it’s not quite performing as well as it could, and then you have to understand the underlying system to trick it into actually doing the optimization it needs. In Go, it typically takes the form of the developer knowing that something can be stack allocated and then jumping through some hoops to tell the compiler not to use heap allocation. In Rust, I think a more common gyration is when you know something could be done SIMD but have to make sure the compiler understands that by writing the loops out in a certain way. Databases keep you from having to write manual loops over your data, but you have to tell them the indices ahead of time and even change how you write a query to keep it on the indexed path. Lots of systems end up having this property.
Are we sure that we can’t have both, though? For example, Common Lisp allows incremental typing which acts as hints to the compiler, and the mature compilers will use them to generate very efficient machine code if they can. The v8 VM and Hotspot JIT for the JVM both come from work on heavily optimizing Smalltalk and Self (see Strongtalk).
I do think we can have both, I like the approach I see from Red with Red and Red/System.
I’ve been imagining in my own language a high level DSL to generate C code to interface into, but with analysis done to generate safe C and make it highly interoperable.. maybe a pipe dream, but I do think there’s a lot of unexplored space here.
Using a lang like Rust means I never have to encounter a case where the language does not meet requirements due to performance (unless it’s need to drop down to optimize assembly or something and I just don’t think I’ll ever really need that)
C is faster than rust though, just so everyone knows
I don’t think that’s necessarily the case. Why do you think that C is faster?
It’s a very broad topic with a lot of nuances but let me share a few short points.
On one hand Rust design enables fearless concurrency. Clear ownership makes it easier to write concurrent code and avoid costly synchronization primitives. Stricter aliasing rules give more optimization opportunities (modulo compiler backend bugs).
On the other hand, there are programs which are easy to express in C but writing them in Rust is painful (see Learn Rust With Entirely Too Many Linked Lists). The cost of array bounds checking is probably non-zero as well (although I haven’t seen a good analysis on this topic).
Marketing language like “fearless concurrency” tells us nothing about what Rust design enables or is good for. I’ve never been scared of concurrency, just annoyed by it. What practices or features does Rust afford the programmer that improves their experience writing concurrent code? This is something I haven’t yet heard.
On one hand Rust design enables fearless concurrency. Clear ownership makes it easier to write concurrent code and avoid costly synchronization primitives. Stricter aliasing rules give more optimization opportunities (modulo compiler backend bugs).
You seem to be arguing that rust is faster to write; I was talking about how fast the code runs. I suppose if you write concurrent C code with a similar amount of time and attention as writing the same program in rust, you could end up with slower code because writing C properly can take longer.
You seem to be arguing that rust is faster to write; I was talking about how fast the code runs. I suppose if you write concurrent C code with a similar amount of time and attention as writing the same program in rust, you could end up with slower code because writing C properly can take longer.
I agree with you on this. Thank you for putting it in clearer terms than I could.
For any higher level language than C, and for any problem, you can say that given enough time and enough attention you can write a faster solution in C. And then the same argument goes for assembly. And then the same argument goes for designing specialized hardware.
(It looks like a Turing tarpit of performance in a sense.)
That is very interesting and unexpected; I would retract my unqualified statement that C is faster than Rust if I could still edit the comment. Thanks for sharing.
The cost of array bounds checking is probably non-zero as well
It is a runtime instruction which adds overhead, however if you know the length of your inputs at compile time and can hint them to the compiler and it can prove you never go out of bounds then it will get rid of them.
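A hedged sketch of the usual trick: write the loop so the optimizer can prove the index is in range (or skip indexing entirely), and the checks typically disappear:

fn dot_indexed(xs: &[f32], ys: &[f32]) -> f32 {
    let mut total = 0.0;
    for i in 0..xs.len() {
        total += xs[i] * ys[i]; // ys[i] still carries a bounds check the compiler may not remove
    }
    total
}

fn dot_zipped(xs: &[f32], ys: &[f32]) -> f32 {
    xs.iter().zip(ys).map(|(x, y)| x * y).sum() // no indexing, nothing left to check
}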
but writing them in rust is painful
You could say the same the other way too. Having access to a hash map, iterators, growable vectors, and an ever growing list of libraries (crates) you can write code at a high level that has close performance to C without really trying. I did advent of code in Rust in 2021 and while I spent extra time with the borrow checker and memory management my code looked to be a similar abstraction level as if I had written it in Ruby.
We need to go beyond programming based on procedures and data structures.
Yes we do!
And that doesn’t mean we need to stop programming based on procedures and data structures. But algorithms and data structures are only a small part of what we need to do, and we need ways to express these other things that we do (which I would claim are architectural). Our current “general purpose” programming languages are largely equivalent to ALGOL, the ALGOrithmic Language. They are DSLs for the domain of algorithms.
I think this goes to the heart of what Crockford was intending to communicate. We seem to not lift our eyes beyond the minutiae of language features to new mechanisms of abstraction and representation. It seems we are stuck with functional decomposition and data modeling as all there is and will ever be, with object orientation as just a means to organize code.
We need new ways to model relationships, interactions and constraints. New ways to represent general-specific and whole-part relationships. But more importantly even higher abstractions that let us better model systems into code particularly outside the domains of the computer and data sciences and computing infrastructure.
It’s important to note that Crockford, Mark Miller, and other E veterans deliberately steered ECMAScript to be more like E. From that angle, Crockford is asking for something better than ECMAScript, which is completely in line with the fact that “E” should be more like Joe-E, then E-on-Java, then Caja, then Secure ECMASCript…
When we’re talking about application development, raw performance is the last characteristic that is interesting to me.
There was a recent story about how Mojo takes Python and adds a sub-language for optimization. It seems like a similar approach would be great for JavaScript. WASM is great and all, but it suffers from the two language problem, but worse because of how sandboxed WASM is.
That is a problem with WASM, but the funny thing is that it started exactly like that – asm.js was a subset of JavaScript that the browser JITs could optimize to native performance. And that became WASM, which reduced the parsing burden with a binary code format.
The reason the subset approach works for Mojo is because it’s a language for machine learning. It’s meant for pure, parallel computation on big arrays of floats – not making DOM calls back to the browser, dealing with garbage-collected strings, etc.
The Mojo examples copy arrays back and forth between the embedded CPython interpreter and the Mojo runtime, and that works fine.
WASM also works fine for that use case – it’s linear memory with ints and floats. It would be slower because there’s no GPU access, and I think no SIMD. But you wouldn’t really have the two language problem – the simple interface works fine.
Machine learning is already distributed, and JavaScript <-> WASM (+ workers) is basically a distributed system.
asm.js was a subset of JavaScript that the browser JITs could optimize to native performance. And that became WASM, which reduced the parsing burden with a binary code format.
Wasn’t AssemblyScript the answer to the loss of asm.js? That was my understanding, but maybe I’m wrong.
I don’t understand. asm.js was a strict subset of JavaScript designed to be a compilation target. AssemblyScript is, supposedly, “designed with WASM in mind,” so presumably spiritually successive, given asm.js inspired WASM?
AssemblyScript is not any sort of successor to asm.js. It’s a programming language (not a compilation target) that uses TypeScript syntax and compiles to WebAssembly.
Though “for AI” is everyone’s tagline at the moment. I’m not sure I’d read too much in it. I just saw AWS advertising on my IAM login screen that I should store vectors in RDS Postgres with pgvector because AI.
My understanding is that Mojo is being led by Chris Lattner, who after leaving Apple (where he started Swift) went to Tesla to work on self-driving cars and then did more AI stuff at Google. At one point, I think he was working on FFI from Swift to Python just for the ML ecosystem. I think the AI part of the description is more than just pure marketing.
How does it get the structured data out of tools? Does it have parsers for them or rely on something like the libxo mode in most FreeBSD tools? The former seems quite fragile because it relies on the formats not changing. One of the big motivations for the libxo work in FreeBSD was Juniper appliances breaking when ifconfig output gained an extra field.
They have parsers for common formats such as json ('{ "foo": 1, "bar": 2 }' | from json) (see formats) but yeah, if you want to parse custom output from a 3rd party command, you will have to do it by hand.
By accepting a wide variety of structured data (up to and including SQLite databases), and providing builtins that wrap around a lot of commonish use cases (like ls) for shell bits.
For line oriented formats, it’s about on par with Awk, ish.
If you have a properly novel scenario, you can also write a plugin, which should give you access to libc if needed.
So, this is funny, but I’ve found myself using both the Godot and Tic-80 built in text editors for a lot of things, and only missing Vim keybinds at the extremes.
I realise that it’s not related to tic-80 but you reminded me…
I played a bit of TIS-100 on Steam recently where you program a series of extremely limited pseudo assemblers. I found the built-in editor to be so frustratingly not-vim that I played the game by finding the save files on my file system and editing them directly in vim, then reloading the game in Steam to run them.
If you want to write code this way, why would you even choose to use Python at all? I wouldn’t use typing for any of these examples. It’s overkill. Just write some tests and spend your energy on something more important.
Python has a large and rich ecosystem, and is a popular choice for starting new projects. New projects often eventually grow into large projects, where switching languages would require a significant amount of rewrite. Today, when you just wrote it, remembering whether find_item returns None or throws a KeyError may be easy to keep in your head, but as a system grows and expands (especially to FAANG-scale python codebases), or must be worked on by more than one engineer, strict typing is a must. Almost all Python code at my company is strictly typed, and we have millions and millions of lines of it. It truly makes things much easier in the long run, in exchange for typing out twelve extra characters before you push your commit.
As a counterpoint, I’ve built a 1MLOC+ codebase from zero in a “FAANG-scale” monorepo. More than half of it was Python without typing. We had no issues static typing would have solved. The arguments in favor of static typing seem worth it on the surface, but in practice you aren’t gonna need it unless it’s critical for performance gains (like Cython or Numba).
FWIW, I’ve had the exact opposite experience in the same situation (untyped Python “FAANG-scale” monorepo with millions of lines of code). I think static typing would have helped understand what code meant.
At one point, I wanted to modify a function that took an argument called image. But what attributes does that image have? It was passed by function after function, and so I spent a bunch of time tracing up the callstack until I found a place where the image value was generated. From there I looked at the attributes that that image had, and went on my merry way…except that the function that I wanted to modify was using duck typing, and there were multiple image types, each with different sets of attributes.
If static typing was in use, it’d be obvious what attributes I could access on the image. Less because of compile-time/static checking, but more because it would make it easier to figure out what functions were doing. Yes, this is a problem of documentation, but if I want functions to be documented with what attributes should be passed into them, we might as well do it in a computer-understandable way, so that the computer can check our work.
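As a sketch of what I mean (attribute names invented), a typing.Protocol is enough to write the duck type down where both readers and the checker can see it:

from typing import Protocol

class ImageLike(Protocol):
    width: int
    height: int
    pixels: bytes

def thumbnail(image: ImageLike, max_side: int) -> ImageLike:
    """Shrink `image` so its longest side is at most `max_side`."""
    ...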
In my experience this is due to expecting to work in dynamic codebases the same way as in static ones. In a dynamic codebase I’d drop into a debugger & enter the repl & see what the image actually has. This may seem roundabout, but in practice you see more than you can with types, because not only can I see the properties, I can play with them & test out calling methods on them with the relevant information.
Except that duck typing means that what the image has could change from run to run, and if you need to access something outside the existing (assumed) interface, you might make a bad assumption.
This isn’t a direct answer, but I think both of these questions miss the bigger picture of what we want to build towards:
Can we have both a dynamic & inspectable runtime and development tools that catch bugs during development with minimal effort? Type hints are a halfway solution; the future I dream of is a fully inferred static analysis system (idk if types are enough) that can understand the most dynamic code of python/ruby/js/clojure and let us know of potential problems we’ll encounter. Current gradual type systems are too weak in this regard, they don’t understand the flow of code, only “a may have x properties”.
For example:
a = {x: 10}
useX(a)
a = {y: 20}
useY(a)
Looking at this code, it’s clear this won’t crash, yet our type systems fail to understand shapes over time. The best we can currently do is {x: number} | {y: number}, which requires a check at each location.
Can we imagine a future where our tools don’t prescribe solutions, but trust you to do what you want & only point out what will fail.
All this being said, this may be bad code, but bad code that works is better than bad code that doesn’t work. This could also enable possibilities we can’t dream of.
And then what we traditionally call “pair programming compiler” ala elm, can be lint rules.
I mean, Typescript does have an understanding of the execution of code, where it can do type narrowing, and it allows you to write your own type predicates for narrowing types down.
Shapes over time is definitely a thing typescript can do, though it can take a bit of convincing.
More than half of it was Python without typing. We had no issues static typing would have solved.
Every time I’ve had this discussion with someone it turns out they had tons of issues that static typing would have solved, they just didn’t realize it, or they were paying massive costs elsewhere like velocity or testing.
That said, Python’s type system is kind of shit so I get not liking it.
I work with a large progressively-typed codebase myself, and we frequently find thousands of latent bugs when we introduce new/tighter types or checks.
We had good test coverage, leveraging techniques like mutation testing (link 1, link 2). The other half of the codebase used static types and didn’t have a higher level of productivity because of it. Once you have such a large and complex codebase, the fundamental issues become architectural, things like worst-case latency, fault tolerance, service dependency management, observability, etc.
Serious answer: types are, among other things, like very concise unit tests (I seem to recall a comment like “Types are a hundred unit tests that the compiler writes for you”, but I can’t find it now), but some bug might still slip through even a strong static algebraic dependent liquid flow quantitative graded modal type system, and tests are another level of defense-in-depth (and I don’t think any language has such a hypothetical type system — I’d like to see the one that does!).
I remember seeing someone (I think Hillel) explain that the sensible way to check whether a really complicated static analysis thingy (such as a very complicated static type signature) actually says what you think it says is to try applying it to some obviously-wrong code that it ought to reject and some obviously-right code that it ought to accept.
The idea of unit testing your static types is greatly amusing to me and feels like an obviously good idea.
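Python even has a blessed way to do exactly that now; a sketch (find_item is made up, and assert_type is in typing as of 3.11):

from typing import assert_type

def find_item(items: dict[str, int], key: str) -> int | None:
    return items.get(key)

# "Unit tests" for the signature: the type checker flags these lines if the
# declared or inferred types ever drift from what we claim here.
assert_type(find_item({}, "x"), int | None)
assert_type(find_item({"a": 1}, "a"), int | None)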
If you’re forced to write Python, perhaps because you have a large existing Python codebase that would be prohibitive to port to another language, setting up a typechecker and using type annotations is an investment of energy that will greatly pay off over time in terms of minimizing bugs. I do agree that it would be better to not write code in Python at all, though, and choose a language that gives you better tools for managing your types from the get-go.
Having written an immense amount of Python and uplifting a massive commercial Python 2 codebase to fully type-hinted Python 3: there’s something to be said for being able to drop into a REPL, experiment, mess around, prototype, import your existing modules, mess around some more, and then lean on a static type checker once things are more solidified. It’s a nice workflow for exploratory programming.
Yes, I think adding types once an interface has completely solidified and has many dependencies could make sense because at that point it’s mature and you care more about stability than velocity. But starting with type hints when you’re first authoring a file undermines the advantage that Python provides for exploratory programming. That is what I’m against.
As a more ops-focused person, Python is still the language of choice for a lot of teams I’ve worked in. I’ve used patterns like this when we needed to write code that the team could easily pick up and maintain, but which also needed more explicit safety guarantees than the Python of 5 years ago might be expected to give you.
More concretely, why test for “this never happens” when you can simply make sure it can’t happen and then test that?
It’s only boring in a proprietary/Windows context. It hasn’t really taken off in FOSS/Linux environments AFAIK. But I could be wrong due to living in this tech site’s bubble. We don’t get a lot of .NET submissions here either.
I mean, C#/.NET has been my breadwinning language for around a decade at this point, and “only boring in a Windows Context” ignores Unity, and the growing ASP.NET Core deployments in various flavors of Docker and/or Kubernetes. From a business side of things, I’ve worked at companies up and down the size spectrum that have a lot of code based in C#. It is, after all, used to run one of the most popular programming websites (Stack Overflow).
It’s not used much inside the Valley, but at this point, that’s mostly bias and inertia.
I had no idea Unity was a C# thing! I’m not in the Valley, but still very much in the FOSS/UNIX world and I’ve been avoiding the “traditional” tech companies (who would typically all use Java, C# and Windows) like the plague.
lmao this is a great callout; I hear amazing things about it (esp. LINQ) and I think for years it was well ahead of Java (pre-Java 8); I think the omission is because I’ve spent too much time in high-growth VC companies that avoided .NET. 😛
I think it’s absolutely in the category though, and like Java, probably one of the ones I’d prefer.
A few years ago I tried using F# for an Advent of Code problem; I’m still curious to try it and may get back to it.
I think this works up to a point, but there are times when the shape of the data influences how you interact with it, especially in places like video games or involved UIs, one because of performance, and the other because of high interactivity.
I’ve got at least 3 stories of spotting similar things once I threw some performance analysis at something.
One time a co-worker was building a 500mb CSV file in C# using string concatenation. It was taking around 20 minutes? I was able to suggest using a StringBuilder (although, just streaming the results to a file would probably be better), and get it down significantly.
Another time, I was building out a stack based language (PISC, for anyone curious), and was wondering why some of my benchmarks were slow. Pulled out a performance analyzer, and realized that I was parsing numbers over and over when they were used as literals.
Most recently, I was building a small Lisp in F#, and it was taking multiple seconds to parse and execute around a kilobyte of code. Digging into it more, I realized that I was unconditionally constructing unused error objects for every symbol. That path in F# turned out to be rather heavyweight.
One time a co-worker was building a 500mb CSV file in C# using string concatenation. It was taking around 20 minutes? I was able to suggest using a StringBuilder (although, just streaming the results to a file would probably be better), and get it down significantly.
I had a similar issue with ETL jobs that were repeatedly concatenating pandas data frames. The input data was about 100 MB and it was crashing 16 GB RAM servers with OOM errors. Turns out the concatenation was n^2 like string concatenation. Those jobs were fractals of bad design.
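The fix is the same shape in both cases: accumulate the pieces and do one concatenation at the end. A rough pandas sketch (paths being whatever the job iterates over):

import pandas as pd

chunks = []
for path in paths:                            # instead of df = pd.concat([df, chunk]) inside the loop...
    chunks.append(pd.read_csv(path))
df = pd.concat(chunks, ignore_index=True)     # ...concatenate once, avoiding the O(n^2) copying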
My point being that, instead of the company laying off people, now it’s the agency hiring and firing people (even though they may be calling it differently).
True, but ideally a contracting company has more flexibility to find good niches that are less affected by those cycles than a product company does, and more power/experience to explore the market than an individual.
That’s a valid argument, but a contracting company is also burdened with making money for its managers, supporting staff and share holders and that overhead eats into the opportunity you mentioned. Based on my personal observations, seems to me like contracting company employees find themselves in even less stable circumstances during macroeconomic downturns.
Oh, certainly. I’ve never actually worked for a contracting company that was particularly worth it, but I’ve also only dipped my toes into it briefly a few times.
A lot of them lay off folks on the bench and younger folks. I’ve had several friends impacted this way–it’s just kinda assumed. When business picks back up, they’ll usually be able to get work there again.
That might not require changing the languages themselves!
For example, if you can have a $LANG <-> $L translation, where $L is a “compressed” version of $LANG optimized for model consumption, but which can be losslessly re-expanded into $LANG for human consumption, that might get you close enough to what you’d get from a semantically terser language that you’d rather continue to optimize for human consumption in $LANG.
So all those years of golfing in esolangs will pay off?? I’ve thought about this too, and you might be able to store more code in your context window if the embeddings are customized for your language, like a Python specific model compressing all Python keywords and common stdlib words to 1 or 2 bytes. TabNine says they made per-language models, so they may already exhibit this behavior.
Or perhaps there will be huge investment in important language models like python, and none for Clojure. I have a big fear around small languages going away - already it’s hard to find SDKs and standard tools for them.
I don’t think small languages will be going anywhere. For one thing, they’re small, which means the level of effort to keep them up and going isn’t nearly as large as popular ones. For another, FFI exists, which means that you often have access to either the C ecosystem, or the system of the host language, and a loooot of SDKs are open source these days, so you can peer into their guts and pull them into your libraries as needed.
Small languages are rarely immediately more productive than more popular ones, but the sort of person that would build or use one (hi!) isn’t going to disappear because LLMs are making the larger language writing experience more automatic. Working in niche programming spaces does mean you have to sometimes bring or build your own tools for things. When I was writing a lot of Janet, I ended up building various things for myself.
This already exists in C#, and has for some 5 odd years now. I think the relevant type is FormattableString
It is a good idea, and I do think we should try to get it into more languages.
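A quick sketch of why that type is handy: the template and the arguments stay separate, so whatever consumes it can decide how to render or escape them (e.g. parameterized SQL):

var name = "ada";
FormattableString q = $"select * from users where name = {name}";
Console.WriteLine(q.Format);         // "select * from users where name = {0}"
Console.WriteLine(q.GetArgument(0)); // "ada" -- the raw value, not yet interpolated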
It’s in JavaScript and Swift too, IIRC.
I’m probably the perfect audience for nushell, since I do do a ton of personal ad-hoc scripting, I do hate the text-focused approach of typical shells and I (despite the syntax and verb-noun commands) really liked working with PowerShell when I used it.
The thing that makes me doubt stuff like nushell and similar is one big question: how?
PowerShell works (on Windows, I have not used it outside of Windows) because MS owns the whole platform. All the basic cmdlets are owned by them (plus they didn’t go for compat with bat), the libraries they depend on are owned by them, etc. Being able to call into arbitrary libraries really sold me on PowerShell, because it opened up a whole world of scriptable stuff that was not doable in bash or similar.
On Linux especially (the BSDs and other UNIXen could take this approach with the base system if they wanted to), nobody owns anything. That means that the example of ls | size > 10kb is either calling a builtin replacement for ls, which may work differently or have different, incompatible flags; or it’s parsing the output of ls to turn it into an object, which is scary from a portability perspective. The latter especially is worrying for me, since I use a mix of OpenBSD, FreeBSD and macOS (work) typically, and even if they have a shared heritage, flags and output across base utilities have diverged. Even if there’s a totally functioning, portable wrapper around/replacement for ls, where does that stop being the case? How does trying to use data from ifconfig work across platforms?
My hesitancy to try something like nushell is not because I think it’s a bad idea. I love it, I’d love to never think about parsing text in shell again and the associated pitfalls. But I can’t help but wonder where all of the niceties will inevitably fall apart, what escape hatches exist, and how well that’ll work. Honestly I should probably just try it and find out for myself.
I recommend trying it.
ls is a nushell builtin that produces a table, though there is syntax to force using an executable that shares a name with a command.
As far as escape hatches, nushell has a pretty powerful parse command for row-based data.
The networking/firewall side of things doesn’t really have builtins atm.
FWIW I glossed over this the first time around, and it seems like many HN commenters did too, but once you add this constraint, it takes many languages off the table – Java, basically all Lisps, Julia, OCaml, etc.
Not sure about APL and Forth, since I don’t think they really have complex data (nested) structures, or at least the language ergonomics around them are pretty bad.
So reading between the lines a bit, I believe the author is asking for a REPL’d, interpreted, and fast running Go, Swift, Rust, C++, or Zig. (which to me is an interesting problem – I basically use shell as my REPL for C++)
OCaml with new value types would qualify, and Java with value types would qualify, but they’re not there yet AFAIK.
I think Julia/Clasp can get that kind of performance in some cases, but it’s not predictable performance, with language control.
Java with value types: C#
C# + LinqPad fits this bill. Not quite a REPL, but scripts run so quickly that it is near the same experience. I’ve used that for interactive coding quite a few times.
That does have the disadvantage of being Windows only, but it is worth a try.
Yes good point! Although .NET is always JITted? Seems like he was against JITs in this problem definition
It is possible to AOT-compile C#, but that’s certainly not happening in LinqPad as far as I’m aware. Java is also nearly always JIT-ed, outside of Android, I think.
I wonder, as far as handles go, if you can make smart handles that implement some sort of refcounting. It’d add overhead, of course, but it’d let you re-activate item slots with confidence.
Granted, that may well be a problem that doesn’t need to be addressed for most games.
If the upper bound of your handles is smaller than the size of integer you’re trying to fit them in, you can use the remaining bits for (sticky) ref counts.
It’s funny, because, for me, OTP was the hangup on being able to be more productive with Elixir more than the lispiness was.
I’m also looking at Elixir and Phoenix (coming from Python/C++), and it looks cool. Doing realtime features with Python requires something like Tornado / asyncio, which isn’t ideal IMO.
I’m all for the immutable / message-passing architecture in the large, but I wish that you could just write Python-style code within processes. The VM copies all data structures at process boundaries anyway.
I think that language would be very popular!
I wrote Lisp 20 years ago, before Python, but now it feels a little annoying to “invert” all my code, for what seems like no benefit
e.g. from https://learnxinyminutes.com/docs/elixir/
I would rather just write
But then also have the Erlang VM to schedule processes and pass immutable messages.
There is obviously a builtin in Python that does this, and I’m sure there is a better way to do this in Elixir. But I think the general principle does hold.
(copy of HN comment)
fucking hell it’s on the front page of hackernews? Which of you jerks is responsible for that?! My server, it’s melting, it’s meeeeeelting!
…nah it’s actually doing fine. Gotta admit that HN seems the ideal venue for an “X for cynical curmudgeons” write-up, though I’d probably use stronger words if I were being honest about it.
It’s a weird experience, isn’t it? I generally go to ground for a couple of days the few times that’s happened to me.
Definitely a weird experience. Nice way to stress-test one’s infrastructure though. Apparently gitit occasionally flakes out under moderate load for uncertain reasons and just gives you a mystery error with Happstack saying “something died, whoops”, which under normal circumstances I see like… once every 2 years and which is fixed by reloading the page. But, throw a quarter million page views at it in 12 hours and the problem seems to crop up enough that it’s actually noticeable.
gitit has served me well for a long long time, and is actually lightly maintained again which was not the case for a few years, but I keep wanting to switch to something a little less fiddly to build and maintain. (Though once it is built it basically runs forever until you turn it off. There used to be occasional memory leaks but they seem to have gone away.) Unfortunately, it seems like writing wiki software that is simple and also works well is no longer fashionable, so if you’re not running PHP you have like… 4 plausible choices, none of which I like. I keep low-key meaning to write my own wiki software, but the user management and CMS bullshit are not actually terribly fun to implement.
Also a great fucking example of why I never read HN. Half the comments there are “Elixir is just a thin wrapper over Erlang” when the entire fucking point of the post is me saying “I thought the exact same thing but it’s way more than that, let me tell you exactly how!”
I might be mistaken, but I believe the use of recursion is to help preemption. The BEAM will give you a certain number of function calls before switching over to another process. Using recursion instead of loops keeps the functions small so things don’t hang.
(And if I am mistaken, I’m relying on Cunningham to correct me here.)
Spot on – loops would introduce the ability for a single process to hog a CPU, because the Erlang scheduler uses function calls as potential stopping points for the execution of each process, and a loop can be free of function calls.
Using recursion also eases hot upgrades, as a function can hand off its state to a more recent version of itself, thereby upgrading mid-loop. This only happens with module:function() (Erlang) or Module.function() (Elixir) calling syntax; function()-style calls without a module specified will not take upgrades into account.
Here’s a lot more detail on Erlang’s scheduling logic: https://github.com/happi/theBeamBook/blob/master/chapters/scheduling.asciidoc
This looks like a fun read! I’ll have to check it out later. Thank you for sharing!
Elixir has loops (for is in Kernel.SpecialForms and isn’t really a macro), and they consume reductions. The tuple from :erlang.statistics(:reductions) is {total reductions, reductions since last call}.
for isn’t really a loop. It is a list comprehension with some additional bells and whistles.
Yeah I’ve heard that, and it makes sense that Elixir adheres to that constraint. I guess I’m thinking more if you were to design a concurrent language from scratch.
There’s always Go, but I’ve heard a lot of Go code doesn’t even use channels much, and as far as I know the data you pass over channels is mutable
I think Pony and Monte might be of interest then?
Entirely up to you in both cases. Yes, you can get by with hardly using channels, especially if you’re writing a webapp, where your unit of concurrency is the request and requests effectively never interact with one another (except via the database or whatever). Yes, you can pass mutable things around using channels, but if you do then you have to think about ownership (effectively, the goroutine that receives a value via a channel is the only one that can safely even look at it until it relinquishes ownership by sending it to someone else or doing some other synchronizing thing like putting it in a sync.Map or storing it in a variable protected by a mutex) while if your data is immutable the question doesn’t even arise. So it’s definitely possible to learn that avoiding mutability, or at least limiting it to a few places, is good style.
Good question; where exactly DOES the scheduling happen? I thought the Go runtime was an example of a lang where the compiler inserted a “yield to scheduler” call every X loop iterations, but based on some searching of stackoverflow it appears I was mistaken. I tried to dig into the BEAM interpreter source code and it appears to have a yield instruction as well as yielding on specific operations like waiting for a message or exiting a process, but it also has a lot of templates generating the actual code to interpret instructions, so it’s hard to figure it out for sure.
If you control the compiler and VM, you can in principle interrupt lightweight threads whenever you feel like it. Just have a flag that gets checked after executing every instruction. It’s probably also possible to get the OS to help you out a little; something like make an OS thread that will sleep and then send a signal to the current process, and have the signal handler pause the VM in its tracks and change where it is currently executing. It’d suck, and suck 10x more with multithreaded execution, but you can do it. Hmmm.
Either way, Erlang and Elixir don’t even have loops, so. Recursion or bust!
Basically Erlang is made up of BIFs (Built-In Functions), which are written in C. Each BIF call is a yielding point. BIFs can also yield mid-flight in some cases. Each BIF has a “cost” counted in “reductions”. This is quite abstract; the cost is not directly linked to wall time.
Each process is given a reduction budget by its scheduler. Every BIF call checks whether the budget is used up. If not, it subtracts its reduction cost, runs, and moves on to the next BIF. Otherwise, it schedules itself out and tells the scheduler about it.
This makes it look preemptive, because from the POV of the user code it is. But at the runtime level it is cooperative at every basic op.
I’m a little hazy on the details, but the operative word to investigate is reductions. I think the term is borrowed from Prolog and represents a function invocation.
I think it’s a bit simpler and more performant if you can check every time a function ends. Since you don’t have mutable, global state you just need to hold onto the next function and its arguments. Then you invoke them the next time the process needs to be scheduled. Though I don’t know how that works with stack traces.
Yeah, and if I’m remembering correctly that’s not just a philosophical choice, but useful for the implementation.
Again, I’m really talking at the edge of my knowledge here so please smite me if I’m wrong, gods of the Internet.
Here’s an old and detailed HN comment which says that the BEAM is reduction scheduled.
or
(there are other ways too, depending on how kinky you’re feeling)
I’m not quite sure what you mean by “invert all your code”.
EDIT:
Given how much Python seems to hate functional programming and idioms, there is definitely an impedance mismatch when translating code over. That said, with many years of experience behind me, I can assure you that it’s quite easy to write hacky procedural code in Elixir if you want to.
Yeah I should have picked a different example – the idea I was trying to get across was that mutability within a loop is useful.
Here’s a similar question I asked with Python vs. OCaml and Lisp. Short-circuiting loops seems to cause the code to become fairly ugly in functional languages
https://old.reddit.com/r/ProgrammingLanguages/comments/puao1v/im_using_python_now_2013/he2syf8/
You can do it pretty nicely with reduce_while (or plain recursion if preferred).
I definitely have had times where I’ve struggled a bit with the immutability of the BEAM when writing elixir, usually when modifying big bags of state.
It’s the need to reach into structures and to copy modified structures that has given me the most grief so far. And there is stuff like Kernel.update_in and co, but yeah.
I struggled a bit with that, and with not having early returns.
Comprehensions are really good, and if you need to mutate something in a for loop then reduce: works very well. Error handling and early returns map onto the with expression, which is a bit alien, but quite good. I still miss early returns tho. I think while loops are best done with recursion or Stream.unfold; I usually prefer recursion.
I have found that Elixir can work for little scripting things quite nicely, in part due to being able to put multiple modules in a single file. There are still some edges, especially if you spin up a server inside said script.
I have yet to do arg parsing in such a thing, though.
This looks cool, but the semantic model looks like it pulls everything into memory to process it, would that be correct?
Still handy for jobs that fit in memory, ofc, but it gets a little awkward for larger things; that may not be a goal for you, though.
This is the current behavior, and it’s easy to grok, but I agree it doesn’t scale well. I’m still thinking through how I want to model streaming for larger or longer-lived processes. See also this thread
The most liberating thing for me was to start thinking about the RDF-compatible XML subset (that is, the data model of XML and why namespaces are helpful in a standardized protocol) and forget about the syntax.
Any XML with zero xmlns or being used ad-hoc with no spec is of course super cringe.
It’s kinda funny, because for me, xmlns has been one of the larger sources of frustration when trying to query XML documents from code.
How so?
The .NET implementation of XPath required you to set up each namespace to be able to select any tags in that namespace.
Sure, of course. The namespace is part of the name, and you need to specify what you are actually selecting so you don’t select things from namespaces you don’t know about. That’s kind of the point.
It’s not just needing to specify the name; it’s that every time you set up an XPath query using the XmlDocument class, you have to add a nametable.
The first answer here is an example of the fiddliness in .NET. It’s a lot when you compare XPath to things like jq or regex, and the failure mode if you don’t do that, or don’t know to do that, is that your XPath query silently doesn’t return any results.
Like, I get why xmlns exists, but adding all of this ceremony to interacting with it adds to the Enterprise Bulk reputation that XML has.
You would rather the query syntax have used something like //{urn:the:namespace}someelement rather than using prefixes and having you specify a table? I’m sympathetic to that for sure, though I think it probably would have complicated the query syntax when many of the elements happen to be in the same namespace.
So.. there’s a lot to unpack here.
But the thing that really irks me is that Douglas made very vague assertions in the video, which is fine for click-bait, but the author of this article is making a ton of assumptions about what those “bad foundations” are.
I am not a Douglas Crockford stan.. but a good place to start if you want to see the world from his perspective is his programming language “E”.
Another thing to pick apart, in the video he said “it may be the best language right now to do what it does, but we should do better.” That would imply we’re including all the modern languages in there.
Now for my take:
When we’re talking about application development, raw performance is the last characteristic that is interesting to me.
And what’s holding us back from writing massive, understandable applications isn’t just the lack of stronger core data structures and destructors; those are baby steps. We need to go beyond programming based on procedures and data structures.
Using a lang like Rust means I never have to encounter a case where the language does not meet requirements due to performance (unless I need to drop down to hand-optimized assembly or something, and I just don’t think I’ll ever really need that). Even though I don’t “need” the performance most of the time, it is nice knowing that it’s there if a problem comes up. With Ruby, if you hit an issue it’s “write a C extension”, but then I write Ruby because I don’t want to write C.
The other thing I think about is expressivity. If a language does not have semantics that allow me to express my low level performance requirements to the machine (so it can be optimized), what other kinds of things are hard to express?
We spent decades trying to decouple logic “compute” from memory only to come full circle as it turns out the two are deeply intertwined.
I don’t organize my tech choices around the 1% bottleneck in my system. I guess the difference between us is I don’t mind writing a C/Zig/Rust extension if 99% of my code can stay in Ruby. I think we could find new ways to solve the two-language problem, but I don’t believe it’s as simple to solve as people think. You cannot build Rails in Rust, and Rails is still more boilerplate-heavy & restrictive than I’d prefer; the system I want doesn’t exist yet, but I know it wouldn’t be able to be written in Rust.
I guess what I was trying to express is that the two language problem isn’t about performance but rather expression (for me).
I love that I have the capability to express to my computer “I want exclusive access to this input” (via a mutable reference) or that I can express “this function does not mutate its input” or “this function should not allocate”.
I am a Ruby core contributor and maintain syntax suggest. After about a year of writing Rust code, I ran into an internal mutation bug in syntax suggest (written in Ruby) that cost me a few hours of my life. In Rust, that kind of logic bug would have been prevented by default, because the code wouldn’t have even compiled. Yes, that is the same limitation that allows for not needing a GC, but it also has benefits beyond performance.
I’m not advocating that everyone use it for everything, but I am saying you cannot avoid thinking about memory (in GC languages). It’s just a question of how much you have to think about it and when.
That’s a good point, thanks for clarifying.
There’s an inescapable tradeoff in languages that are “smart” at some level: most of the time the language figures out the optimizations on its own and it’s fine, but every once in a while it’s not quite performing as well as it could, and then you have to understand the underlying system to trick it into actually doing the optimization it needs. In Go, it typically takes the form of the developer knowing that something can be stack allocated and then jumping through some hoops to tell the compiler not to use heap allocation. In Rust, I think a more common gyration is when you know something could be done with SIMD but have to make sure the compiler understands that by writing the loops out in a certain way. Databases keep you from having to write manual loops over your data, but you have to tell them the indices ahead of time and even change how you write a query to keep it on the indexed path. Lots of systems end up having this property.
Are we sure that we can’t have both, though? For example, Common Lisp allows incremental typing which acts as hints to the compiler, and the mature compilers will use them to generate very efficient machine code if they can. The v8 VM and Hotspot JIT for the JVM both come from work on heavily optimizing Smalltalk and Self (see Strongtalk).
I do think we can have both, I like the approach I see from Red with Red and Red/System.
I’ve been imagining in my own language a high level DSL to generate C code to interface into, but with analysis done to generate safe C and make it highly interoperable.. maybe a pipe dream, but I do think there’s a lot of unexplored space here.
C is faster than rust though, just so everyone knows
Asm is faster than C though, just so everyone knows
certainly, but nobody is mistaken about that
I don’t think that’s necessarily the case. Why do you think that C is faster?
It’s a very broad topic with a lot of nuances but let me share a few short points.
On one hand Rust design enables fearless concurrency. Clear ownership makes it easier to write concurrent code and avoid costly synchronization primitives. Stricter aliasing rules give more optimization opportunities (modulo compiler backend bugs).
On the other hand, there are programs which are easy to express in C but writing them in Rust is painful (see Learn Rust With Entirely Too Many Linked Lists). The cost of array bounds checking is probably non-zero as well (although I haven’t seen a good analysis on this topic).
Marketing language like “fearless concurrency” tells us nothing about what Rust’s design enables or is good for. I’ve never been scared of concurrency, just annoyed by it. What practices or features does Rust afford the programmer that improve their experience writing concurrent code? This is something I haven’t yet heard.
This Rust book chapter explains it in meaningful terms: https://doc.rust-lang.org/book/ch16-00-concurrency.html
I don’t know the methodology here but I had this graph in mind: https://benchmarksgame-team.pages.debian.net/benchmarksgame/download/fastest.svg
You seem to be arguing that rust is faster to write; I was talking about how fast the code runs. I suppose if you write concurrent C code with a similar amount of time and attention as writing the same program in rust, you could end up with slower code because writing C properly can take longer.
All right, I see what you mean. That said, if we look at https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/rust.html, then we see that the fastest solution is in Rust 50% of the time.
I agree with you on this. Thank you for putting it in clearer terms than I could.
For any higher level language than C, and for any problem, you can say that given enough time and enough attention you can write a faster solution in C. And then the same argument goes for assembly. And then the same argument goes for designing specialized hardware.
(It looks like a Turing tarpit of performance in a sense.)
That is very interesting and unexpected; I would retract my unqualified statement that C is faster than Rust if I could still edit the comment. Thanks for sharing.
Will this do? https://lobste.rs/s/yibs3k/how_avoid_bounds_checks_rust_without
It is a runtime instruction which adds overhead; however, if you know the length of your inputs at compile time and can hint them to the compiler, and it can prove you never go out of bounds, then it will get rid of them.
You could say the same the other way too. Having access to a hash map, iterators, growable vectors, and an ever growing list of libraries (crates) you can write code at a high level that has close performance to C without really trying. I did advent of code in Rust in 2021 and while I spent extra time with the borrow checker and memory management my code looked to be a similar abstraction level as if I had written it in Ruby.
Yes we do!
And that doesn’t mean we need to stop programming based on procedures and data structures. But data algorithms and data structures are only a small part of what we need to do, and we need ways to express these other things that we do (which I would claim are architectural). Our current “general purpose” programming languages are largely equivalent to ALGOL, the ALGOrithmic Language. They are DSLs for the domain of algorithms.
See also:
Why Architecture Oriented Programming Matters
Can Programmers Escape the Gentle Tyranny of call/return?
Glue: the Dark Matter of Software
And of course my current attempt at a fix:
Objective-S
Thank you for sharing these links, they are very insightful!
I think this goes to the heart of what Crockford was intending to communicate. We seem to not lift our eyes beyond the minutiae of language features to new mechanisms of abstraction and representation. It seems we are stuck with functional decomposition and data modeling as all there is and will ever be, with object orientation as just a means to organize code.
We need new ways to model relationships, interactions and constraints. New ways to represent general-specific and whole-part relationships. But more importantly even higher abstractions that let us better model systems into code particularly outside the domains of the computer and data sciences and computing infrastructure.
As far as I’m aware, Pony and Monte both follow in the tradition of E in one way or another.
It’s important to note that Crockford, Mark Miller, and other E veterans deliberately steered ECMAScript to be more like E. From that angle, Crockford is asking for something better than ECMAScript, which is completely in line with the fact that “E” should be more like Joe-E, then E-on-Java, then Caja, then Secure ECMASCript…
There was a recent story about how Mojo takes Python and adds a sub-language for optimization. It seems like a similar approach would be great for JavaScript. WASM is great and all, but it suffers from the two language problem, but worse because of how sandboxed WASM is.
That is a problem with WASM, but the funny thing is that it started exactly like that – asm.js was a subset of JavaScript that the browser JITs could optimize to native performance. And that became WASM, which reduced the parsing burden with a binary code format.
The reason the subset approach works for Mojo is because it’s a language for machine learning. It’s meant for pure, parallel computation on big arrays of floats – not making DOM calls back to the browser, dealing with garbage-collected strings, etc.
The Mojo examples copy arrays back and forth between the embedded CPython interpreter and the Mojo runtime, and that works fine.
WASM also works fine for that use case – it’s linear memory with ints and floats. It would be slower because there’s no GPU access, and I think no SIMD. But you wouldn’t really have the two language problem – the simple interface works fine.
Machine learning is already distributed, and JavaScript <-> WASM (+ workers) is basically a distributed system.
Wasn’t AssemblyScript the answer to the loss of asm.js? That was my understanding, but maybe I’m wrong.
I don’t think AssemblyScript could fill that niche since it is both not a superset of JavaScript and compiles entirely to WASM.
I don’t understand. asm.js was a strict subset of JavaScript designed to be a compilation target. AssemblyScript is, supposedly, “designed with WASM in mind,” so presumably spiritually successive, given asm.js inspired WASM?
AssemblyScript is not any sort of successor to asm.js. It’s a programming language (not a compilation target) that uses TypeScript syntax and compiles to WebAssembly.
Mojo doesn’t appear aimed at application development, their headline is “a new programming language for all AI developers.”
Though “for AI” is everyone’s tagline at the moment. I’m not sure I’d read too much in it. I just saw AWS advertising on my IAM login screen that I should store vectors in RDS Postgres with pgvector because AI.
My understanding is that Mojo is being led by Chris Lattner, who after leaving Apple (where he started Swift) went to Tesla to work on self-driving cars and then did more AI stuff at Google. At one point, I think he was working on FFI from Swift to Python just for the ML ecosystem. I think the AI part of the description is more than just pure marketing.
Urks?
Language confuses me, I spell phonetically - I believe I was looking for “irk”, updated
I like nushell a lot, heh. It’s replaced Linqpad as my “explore and munge data” tool of first resort.
How does it get the structured data out of tools? Does it have parsers for them or rely on something like the libxo mode in most FreeBSD tools? The former seems quite fragile because it relies on the formats not changing. One of the big motivations for the libxo work in FreeBSD was Juniper appliances breaking when ifconfig output gained an extra field.
They have parsers for common formats such as json ('{ "foo": 1, "bar": 2 }' | from json) (see formats), but yeah, if you want to parse custom output from a 3rd party command, you will have to do it by hand.
https://i.imgur.com/VgeV6Vv.png
By accepting a wide variety of structured data (up to and including SQLite databases), and providing builtins that wrap around a lot of commonish use cases (like ls) for shell bits.
For line oriented formats, it’s about on par with Awk, ish.
If you have a properly novel scenario, you can also write a plugin, which should give you access to libc if needed.
So, this is funny, but I’ve found myself using both the Godot and Tic-80 built in text editors for a lot of things, and only missing Vim keybinds at the extremes.
I realise that it’s not related to tic-80 but you reminded me…
I played a bit of TIS-100 on Steam recently where you program a series of extremely limited pseudo assemblers. I found the built-in editor to be so frustratingly not-vim that I played the game by finding the save files on my file system and editing them directly in vim, then reloading the game in Steam to run them.
If you want to write code this way, why would you even choose to use Python at all? I wouldn’t use typing for any of these examples. It’s overkill. Just write some tests and spend your energy on something more important.
Python has a large and rich ecosystem, and is a popular choice for starting new projects. New projects often eventually grow into large projects, where switching languages would require a significant amount of rewrite. Today, when you just wrote it, remembering whether find_item returns None or throws a KeyError may be easy to keep in your head, but as a system grows and expands (especially to FAANG-scale Python codebases), or must be worked on by more than one engineer, strict typing is a must. Almost all Python code at my company is strictly typed, and we have millions and millions of lines of it. It truly makes things much easier in the long run, in exchange for typing out twelve extra characters before you push your commit.
As a counterpoint, I’ve built a 1MLOC+ codebase from zero in a “FAANG-scale” monorepo. More than half of it was Python without typing. We had no issues static typing would have solved. The arguments in favor of static typing seem worth it on the surface, but in practice you aren’t gonna need it unless it’s critical for performance gains (like Cython or Numba).
FWIW, I’ve had the exact opposite experience in the same situation (untyped Python “FAANG-scale” monorepo with millions of lines of code). I think static typing would have helped understand what code meant.
At one point, I wanted to modify a function that took an argument called image. But what attributes does that image have? It was passed from function to function, and so I spent a bunch of time tracing up the call stack until I found a place where the image value was generated. From there I looked at the attributes that that image had, and went on my merry way… except that the function I wanted to modify was using duck typing, and there were multiple image types, each with different sets of attributes.
If static typing was in use, it’d be obvious what attributes I could access on the image. Less because of compile-time/static checking, and more because it would make it easier to figure out what functions were doing. Yes, this is a problem of documentation, but if I want functions to be documented with what attributes should be passed into them, we might as well do it in a computer-understandable way, so that the computer can check our work.
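A hedged sketch of what that could look like in Python (the attribute names are invented, not from the codebase in question): a typing.Protocol spells out what the function expects from image right where the function is defined.

```python
from typing import Protocol

class ImageLike(Protocol):
    # Invented attributes, just to illustrate: the point is that the
    # expected interface is written down next to the function.
    width: int
    height: int
    def to_bytes(self) -> bytes: ...

def add_watermark(image: ImageLike, text: str) -> bytes:
    # A type checker flags any caller whose image lacks these members,
    # and a reader sees the expected interface without tracing callers.
    print(f"watermarking {image.width}x{image.height} with {text!r}")
    return image.to_bytes()
```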
In my experience this is due to expecting to work in dynamic codebases the same way as in static ones. In a dynamic codebase I’d put in a debugger & enter the REPL & see what the image actually has. This may seem roundabout, but in practice you see more than you can with types, because not only can I see the properties, I can play with them & test out calling methods on them with the relevant information.
Except that duck typing means that what the image has could change from run to run, and if you need to access something outside the existing (assumed) interface, you might make a bad assumption.
This isn’t a direct answer, but I think both these questions miss the bigger picture of what we want to build towards:
Can we have both a dynamic & inspectable runtime and development tools to catch bugs during development with minimal effort? Type hints are a halfway solution; the future I dream of is a fully inferred static analysis system (idk if types are enough) that can understand the most dynamic code of Python/Ruby/JS/Clojure and let us know of potential problems we’ll encounter. Current gradual type systems are too weak in this regard; they don’t understand the flow of code, only “a may have x properties”.
For example:
a = {x: 10}; useX(a)
a = {y: 20}; useY(a)
Looking at this code, it’s clear this won’t crash, yet our type systems fail to understand shapes over time. The best we can currently do is {x: number} | {y: number}, which requires a check at each location.
Can we imagine a future where our tools don’t prescribe solutions, but trust you to do what you want & only point out what will fail.
All this being said, this may be bad code, but bad code that works is better than bad code that doesn’t work. This could also enable possibilities we can’t dream of.
And then what we traditionally call “pair programming compiler” ala elm, can be lint rules.
I mean, Typescript does have an understanding of the execution of code, where it can do type narrowing, and it allows you to write your own type predicates for narrowing types down.
Shapes over time is definitely a thing typescript can do, though it can take a bit of convincing.
Every time I’ve had this discussion with someone it turns out they had tons of issues that static typing would have solved, they just didn’t realize it, or they were paying massive costs elsewhere like velocity or testing.
That said, Python’s type system is kind of shit so I get not liking it.
How do you know this?
I work with a large progressively-typed codebase myself, and we frequently find thousands of latent bugs when we introduce new/tighter types or checks.
We had good test coverage, leveraging techniques like mutation testing (link1 link2). The other half of the codebase used static types and didn’t have a higher level of productivity because of it. Once you have such a large and complex codebase, the fundamental issues become architectural, things like worst-case latency, fault tolerance, service dependency management, observability, etc.
Why would I write tests when I have types? :^)
Serious answer: types are, among other things, like very concise unit tests (I seem to recall a comment like “Types are a hundred unit tests that the compiler writes for you”, but I can’t find it now), but some bug might still slip through even a strong static algebraic dependent liquid flow quantitative graded modal type system, and tests are another level of defense-in-depth (and I don’t think any language has such a hypothetical type system — I’d like to see the one that does!).
I remember seeing someone (I think Hillel) explain that the sensible way to check whether a really complicated static analysis thingy (such as a very complicated static type signature) actually says what you think it says is to try applying it to some obviously-wrong code that it ought to reject and some obviously-right code that it ought to accept.
The idea of unit testing your static types is greatly amusing to me and feels like an obviously good idea.
Don’t think I ever said this. It’s a really good idea though!
Hm must’ve misremembered, sorry. Cheers! :)
(Here are some links for anyone wondering what a strong static algebraic dependent liquid flow quantitative graded modal type system would be.)
A+ trolling but also the author might agree.
Saying that you don’t need types as long as you have tests is A+ trolling as well.
I’m only talking about Python, not in general.
“Talking about Python” is general enough with how big and diverse are the use cases and contexts involving Python.
If you’re forced to write Python, perhaps because you have a large existing Python codebase that would be prohibitive to port to another language, setting up a typechecker and using type annotations is an investment of energy that will greatly pay off over time in terms of minimizing bugs. I do agree that it would be better to not write code in Python at all, though, and choose a language that gives you better tools for managing your types from the get-go.
Having written an immense amount of Python and uplifting a massive commercial Python 2 codebase to fully type-hinted Python 3: there’s something to be said for being able to drop into a REPL, experiment, mess around, prototype, import your existing modules, mess around some more, and then lean on a static type checker once things are more solidified. It’s a nice workflow for exploratory programming.
Yes, I think adding types once an interface has completely solidified and has many dependencies could make sense because at that point it’s mature and you care more about stability than velocity. But starting with type hints when you’re first authoring a file undermines the advantage that Python provides for exploratory programming. That is what I’m against.
As a more ops-focused person, Python is still the language of choice for a lot of teams I’ve worked in. I’ve used patterns like this when we needed to write code that the team could easily pick up and maintain, but which also needed more explicit safety guarantees than the Python of 5 years ago might be expected to give you.
More concretely, why test for “this never happens” when you can simply make sure it can’t happen and then test that?
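A small Python sketch of that idea (the names are invented for illustration): instead of a status flag plus an optional error that might disagree with it, model the legal shapes directly, so “failed but with no error message” simply can’t be constructed, and the only thing left to test is behavior.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Succeeded:
    payload: bytes

@dataclass(frozen=True)
class Failed:
    error: str          # a failure always carries its reason

JobResult = Union[Succeeded, Failed]

def describe(result: JobResult) -> str:
    # No need to test "failed but error is missing": that state can't be built.
    if isinstance(result, Failed):
        return f"job failed: {result.error}"
    return f"job produced {len(result.payload)} bytes"
```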
I don’t see anything un-pythonic about any of the examples there except if you consider typing in general to be so (which, fair…).
This is if you want to program Python in a very structured somewhat anal-retentive way.
I find it entertaining that C# isn’t on the list of boring languages listed here.
It’s only boring in a proprietary/Windows context. It hasn’t really taken off in FOSS/Linux environments AFAIK. But I could be wrong due to living in this tech site’s bubble. We don’t get a lot of .NET submissions here either.
I mean, C#/.NET has been my breadwinning language for around a decade at this point, and “only boring in a Windows context” ignores Unity and the growing ASP.NET Core deployments in various flavors of Docker and/or Kubernetes. From a business side of things, I’ve worked at companies up and down the size spectrum that have a lot of code based in C#. It is, after all, used to run one of the most popular programming websites (Stack Overflow).
It’s not used much inside the Valley, but at this point, that’s mostly bias and inertia.
I had no idea Unity was a C# thing! I’m not in the Valley, but still very much in the FOSS/UNIX world and I’ve been avoiding the “traditional” tech companies (who would typically all use Java, C# and Windows) like the plague.
Godot also supports C# as a scripting language, fwiw.
Use of C# and use of Java don’t overlap much, if at all. Use of C# does overlap more than you’d expect with usage of RabbitMQ, tho.
lmao this is a great callout; I hear amazing things about it (esp. LINQ), and I think for years it was well ahead of Java (pre-Java 8); I think the omission is because I’ve spent too much time in high-growth VC companies that avoided .NET. 😛
I think it’s absolutely in the category though, and like Java, probably one of the ones I’d prefer.
A few years ago I tried using F# for an Advent of Code problem; I’m still curious to try it and may get back to it.
I mean, the larger programming world owes C# for the async/await syntax becoming popular, good, bad, or ugly.
I think this works up to a point, but there are times when the shape of the data influences how you interact with it, especially in places like video games or involved UIs, one because of performance and the other because of heavy interaction.
I’ve got at least 3 stories of spotting similar things once I threw some performance analysis at something.
One time a co-worker was building a 500 MB CSV file in C# using string concatenation. It was taking around 20 minutes? I was able to suggest using a StringBuilder (although just streaming the results to a file would probably be better) and get it down significantly.
Another time, I was building out a stack based language (PISC, for anyone curious), and was wondering why some of my benchmarks were slow. Pulled out a performance analyzer, and realized that I was parsing numbers over and over when they were used as literals.
Most recently, I was building a small Lisp in F#, and it was taking multiple seconds to parse and execute around a kilobyte of code. Digging into it more, I realized that I was unconditionally constructing unused error objects for every symbol. That path in F# turned out to be rather heavyweight.
I had a similar issue with ETL jobs that were repeatedly concatenating pandas data frames. The input data was about 100 MB and it was crashing 16 GB RAM servers with OOM errors. Turns out the concatenation was n^2 like string concatenation. Those jobs were fractals of bad design.
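For anyone who hits the same thing, the usual fix is to collect the frames and concatenate once at the end, which is linear in the number of chunks instead of quadratic (a sketch, with the chunk source left abstract):

```python
import pandas as pd

def load_chunks_quadratic(chunks):
    # Anti-pattern: each concat copies everything accumulated so far,
    # so total work grows roughly with the square of the chunk count.
    df = pd.DataFrame()
    for chunk in chunks:
        df = pd.concat([df, chunk], ignore_index=True)
    return df

def load_chunks_linear(chunks):
    # Collect first, concatenate once: one copy of the data overall.
    return pd.concat(list(chunks), ignore_index=True)
```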
Does Postgres fall under the “database machines are more expensive to run in the cloud” rule of thumb?
How does the contract agency handle the boom and bust cycle?
My point being that, instead of the company laying off people, now it’s the agency hiring and firing people (even though they may be calling it differently).
Ideally, by rotating clients, I suspect.
The trouble is, these cycles tend to be global, usually caused by macroeconomic factors.
True, but ideally a contracting company has more flexibility to find good niches that are less affected by those cycles than a product company does, and more power/experience to explore the market than an individual.
That’s a valid argument, but a contracting company is also burdened with making money for its managers, supporting staff, and shareholders, and that overhead eats into the opportunity you mentioned. Based on my personal observations, it seems to me like contracting company employees find themselves in even less stable circumstances during macroeconomic downturns.
Oh, certainly. I’ve never actually worked for a contracting company that was particularly worth it, but I’ve also only dipped my toes into it briefly a few times.
A lot of them lay off folks on the bench and younger folks. I’ve had several friends impacted this way–it’s just kinda assumed. When business picks back up, they’ll usually be able to get work there again.
It is common to charge enough so that they have enough money to keep their employees “on the bench” between contracts.
I assume by having its mass made buoyant by sheer number of clients who need it.
I think, paradoxically, we’re going to see more ultra-terse languages, so that the AI can store more context and you can save money on tokens.
That might not require changing the languages themselves!
For example, if you can have a $LANG <-> $L translation, where $L is a “compressed” version of $LANG optimized for model consumption, but which can be losslessly re-expanded into $LANG for human consumption, that might get you close enough to what you’d get from a semantically terser language that you’d rather continue to optimize for human consumption in $LANG.
So all those years of golfing in esolangs will pay off?? I’ve thought about this too, and you might be able to store more code in your context window if the embeddings are customized for your language, like a Python specific model compressing all Python keywords and common stdlib words to 1 or 2 bytes. TabNine says they made per-language models, so they may already exhibit this behavior.
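A toy illustration of the lossless round-trip idea (nothing to do with how TabNine or any real model tokenizer works): map Python keywords to short placeholder tokens and back.

```python
import keyword
import re

# Give each Python keyword a short placeholder like §0, §1, ... so that
# source text round-trips exactly. A real scheme would live in the
# tokenizer/embedding layer rather than in string substitution.
ENCODE = {kw: f"§{i}" for i, kw in enumerate(keyword.kwlist)}
DECODE = {v: k for k, v in ENCODE.items()}
TOKEN = re.compile(r"[A-Za-z_]+|§\d+")

def compress(src: str) -> str:
    return TOKEN.sub(lambda m: ENCODE.get(m.group(), m.group()), src)

def expand(src: str) -> str:
    return TOKEN.sub(lambda m: DECODE.get(m.group(), m.group()), src)

src = "def f(x):\n    return x if x else None\n"
assert expand(compress(src)) == src
```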
Or perhaps there will be huge investment in important language models like python, and none for Clojure. I have a big fear around small languages going away - already it’s hard to find SDKs and standard tools for them.
I don’t think small languages will be going anywhere. For one thing, they’re small, which means the level of effort to keep them up and going isn’t nearly as large as popular ones. For another, FFI exists, which means that you often have access to either the C ecosystem, or the system of the host language, and a loooot of SDKs are open source these days, so you can peer into their guts and pull them into your libraries as needed.
Small languages are rarely immediately more productive than more popular ones, but the sort of person that would build or use one (hi!) isn’t going to disappear because LLMs are making the larger language writing experience more automatic. Working in niche programming spaces does mean you have to sometimes bring or build your own tools for things. When I was writing a lot of Janet, I ended up building various things for myself.
Timely that I’ve started learning https://mlochbaum.github.io/BQN :)
Perl’s comeuppance!