As far as I can remember, anything you can do on Deno Deploy also works with the Deno binary. The KV database is backed by SQLite when running that way, but you can also self-host a FoundationDB-based version (not sure if this is the same one that powers Deploy).
You’re essentially locked in. You could build all the infra yourself and run your Deno code in the open-source runtime, but that’s lock-in to me.
The nice things about Deno IMO aren’t secret, patent-encumbered enterprise features; they’re just existing ideas brought together and executed well, something I believe the open source world is fully capable of doing :)
Closures are difficult in languages with manual memory management. That’s why Zig does not have them. Rust has them, but they feel different from closures in GCed languages due to borrow checking and lifetimes.
I’m never sure what “expressive” means. Rust’s type system being much more advanced, it is of course possible to “express” and type-check many more concepts in Rust. But Zig being more flexible, especially with comptime, it is possible to “express” algorithms and data structures that don’t fit within Rust’s type system.
Zig offers an original error handling system.
Zig has no macros, but Rust has no comptime :)
One important difference not mentioned in the post is that Rust is a mature and stable language while Zig is still being developed. Some projects such as TigerBeetle, Bun or Ghostty have decided to use Zig anyway, but it’s not for the faint of heart :)
I actually love the name. Celery with type safety is broccoli, because it’s better and safer for you!
I see you plan to support Kafka and Rabbit. Both are very sensible choices given you already support Redis. If you’re looking to get adoption from the bottom up instead of the top down, you might consider adding a Postgres-backed queue for all of us average-scalers, since Redis already covers the high-scalers.
PG as a broker is amazing in my limited experience (with python/procrastinate): with 10-20 million jobs a day at work it does not break a sweat, and it makes observability, debugging and testing so much easier than with celery/rabbitmq. I will not use anything else unless forced to :D
Same here. I don’t understand why people insist on using Redis or Rabbit as a store for job queues when most projects already use PostgreSQL or MySQL. Also, I can rely on transactional semantics when queuing jobs if they are stored in the same SQL database, which is a significant simplification if I want to ensure I never lose jobs.
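Roughly, the shape of that in code (a sketch using libpq; the table and column names are made up, not from any particular library):

#include <libpq-fe.h>
#include <cstdio>

int main() {
    PGconn *conn = PQconnectdb("dbname=app");
    if (PQstatus(conn) != CONNECTION_OK) {
        std::fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    // The job row is written in the same transaction as the business data,
    // so either both commit or neither does.
    const char *steps[] = {
        "BEGIN",
        "INSERT INTO orders (customer_id, total) VALUES (42, 99.90)",
        "INSERT INTO jobs (kind, payload) VALUES ('send_receipt', '{\"order\": 42}')",
        "COMMIT",
    };
    for (const char *sql : steps) {
        PGresult *res = PQexec(conn, sql);
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            std::fprintf(stderr, "%s", PQerrorMessage(conn));
        PQclear(res);
    }
    PQfinish(conn);
}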
I’ve had a lot of luck, on many projects, using a database (Usually PostgreSQL, MySQL in the past) as a mid-high latency job dispatch mechanism. It’s not a replacement for a message queue but it’s rare for people to actually desire/need the trade-offs of a message queue. Usually it’s just a job queue.
Back in ~2011/2012 I was juggling over a billion jobs (each an individual row) in a single jobs table on a MySQL RDS instance. It works fine if you know what you’re doing and if you don’t, well, MQs aren’t any more merciful than databases in liminal circumstances.
I love the name too :). I think that PG queues are very interesting and promising; however, I believe that interacting with PG would require a different approach than how I originally built the lib. That said, I am open to contributions if you want to try to post a draft PR for adding this in.
Hey, I’m just here to create more work, not do the work :)
I actually am not a rust user. I just like to spread the postgres-as-a-queue gospel around the internet when the opportunity arises. Single-datastore applications are a joy to work with.
Rust moving up so much and so quickly is definitely quite frustrating. min-ver's biggest advantage is that it puts the brakes on by default and I think this is a good thing. It is very deliberate in saying that you move up intentionally. I think it’s the better default for future ecosystems.
Since the resolver can be changed between editions it could be possible, but the interactions between old and new crates would be very confusing. Generally, though, I think that the Rust community does not like min-ver.
So many people talking about how bad Kubernetes is without any experience with it. I’m pretty sure many people also tried to host it themselves and got bitten so badly that they now pile on top of it.
I’ve used managed Kubernetes for many projects and they just run fine.
It’s actually a relief to write down what your app needs in order to run and have it just get done.
You want SSD storage attached? No fiddling with terraform to provision the disk and attach it + ansible to configure your OS to do it. With k8s you write your persistent volume claim, attach it to your container, done.
You want something externally available? Just add an ingress. You don’t have to install $reverse_proxy, configure $reverse_proxy, attach public IP, …
You want secret management? It’s already in there. You don’t have to hook something up yourself, like env vars hidden behind permissions, etc.
…
I mean, you have a learning curve, but many things you want are just present by default without you having to care about them. It’s a platform, you learn it once and you just have to use it then (I’m not talking about cluster operators etc. here).
Some people in the comments mention monitoring being an issue. I find it’s quite the opposite. You have everything managed by default: deploy a DaemonSet or a CRD with your monitoring stack and you can be sure that everything will be monitored. You won’t get the usual reply from the monitoring team like “oh sorry, the log file permission was wrong when we updated the box, sorry you weren’t paged for that incident”.
In my experience, k8s exemplifies the Turing Tarpit: everything is possible, nothing of interest is easy.
I’ve run into pretty major footguns for each of those examples, and I have plenty more war stories where that came from.
You want SSD storage attached? No fiddling with terraform to provision the disk and attach it + ansible to configure your OS to do it. With k8s you write your persistent volume claim, attach it to your container, done.
Oops, you wanted more than one process to have write access? Tough luck, you’ll have to use some kind of NFS mount instead
You want something externally available? Just add an ingress. You don’t have to install $reverse_proxy, configure $reverse_proxy, attach public IP, …
We had a couple teams inadvertently open up unauthenticated access to their apps due to a single-character typo in their YAML. whoops
You want secret management? It’s already in there. You don’t have to hook something yourself like setting env-vars hidden with permissions etc.
Oh wait, you want that integrated with your platform’s Secrets Management? You’ll need to install a third-party operator for that
Oops, you wanted more than one process to have write access? Tough luck, you’ll have to use some kind of NFS mount instead
Pretty sure that’s as easy as setting a ReadWriteMany access mode?
We had a couple teams inadvertently open up unauthenticated access to their apps due to a single-character typo in their YAML. whoops
How does this happen (i.e., what was the typo)? Kubernetes ingress doesn’t even do auth out of the box…
Oh wait, you want that integrated with your platform’s Secrets Management? You’ll need to install a third-party operator for that
That seems… fine? Why would you expect a platform to ship with support for a third-party secrets management platform? It’s not like you get automatic secrets management of any kind if you’re managing your own VMs. You still have to install and configure software, but in that case you have to install and configure it on each individual box, and in the best case you’re still having to deal with Ansible or similar versus Kubernetes manifests.
Pretty sure that’s as easy as setting a ReadWriteMany access mode?
Not supported by GKE!
How does this happen (i.e., what was the typo)?
Missing -, which caused the AuthorizationPolicy to interpret what should have been one thing as multiple
Kubernetes ingress doesn’t even do auth out of the box…
Why would you expect a platform to ship with support for a third-party secrets management platform?
I’m responding to the parent’s contention:
I’ve used managed Kubernetes for many projects and they just run fine … many things you want are just present by default without you having to care about them
This is not an accurate description of what it’s like to deploy to a managed k8s system. It’s extremely difficult, time-consuming, and full of footguns. I dread having to make changes to my k8s deployments, because it always surprises me how difficult and tedious it is.
I’m 99% sure we were doing this several years ago with a Filestore storage backend. I set it up–I don’t remember it being particularly troublesome.
Missing -, which caused the AuthorizationPolicy to interpret what should have been one thing as multiple
Was the - a YAML list-item syntax designator? I’m not familiar with AuthorizationPolicies; I don’t think that’s part of Kubernetes or even GKE. Maybe some Kubernetes custom resource definition for some other Google cloud offering, but it feels odd to fault Kubernetes or GKE for something that isn’t part of either.
This is not an accurate description of what it’s like to deploy to a managed k8s system. It’s extremely difficult, time-consuming, and full of footguns. I dread having to make changes to my k8s deployments, because it always surprises me how difficult and tedious it is.
I think no matter what you do it’s difficult and tedious. If you’re managing VMs or hosts, you have pretty similar problems, but you end up building much of what Kubernetes is in your own bespoke way and your organization won’t be able to benefit from all of the documentation, training materials, and hiring pool that you get from using a relatively standard thing like Kubernetes. 🤷♂️
Tough luck, you’ll have to use some kind of NFS mount instead
Pretty sure that’s as easy as setting a ReadWriteMany access mode?
Not supported by GKE!
we were doing this several years ago with a Filestore storage backend
Filestore is NFS! And expensive as hell.
Multiple processes writing to the same storage is a thing computers have been doing longer than I’ve been alive!
I’m not saying you can’t work around this, I’m saying Kubernetes makes this harder.
Kubernetes ingress doesn’t even do auth out of the box…
it feels odd to fault Kubernetes or GKE for something that isn’t part of either
I’m responding to:
You want something externally available? Just add an ingress. You don’t have to install $reverse_proxy, configure $reverse_proxy, attach public IP, …
It’s not that simple! You have to bring your own authn/authz layer, which is significantly more complicated than a reverse proxy.
you end up building much of what Kubernetes is in your own bespoke way and your organization won’t be able to benefit from all of the documentation, training materials, and hiring pool that you get from using a relatively standard thing like Kubernetes.
Kubernetes doesn’t solve this problem! There are so many, many ways to do everything, your k8s cluster will be a special snowflake. You’ll constantly dig up documentation that conflicts with your situation because you didn’t do it exactly that way (cf. my AuthorizationPolicy bug—you haven’t run into this because we’re doing authn differently)
Yes, but you don’t really need to know or care that it’s NFS. And yes, it is expensive.
Multiple processes writing to the same storage is a thing computers have been doing longer than I’ve been alive!
You can do that on any Kubernetes distro without special integration, but you’ll be constrained in exactly the same way as the computers of olde. If you want networked storage, you’ll need to use a networked container storage interface driver.
I’m not saying you can’t work around this, I’m saying Kubernetes makes this harder.
I’m not sure this is true. If all you want is multiple processes writing to the same storage, just fork two processes in your container and have them write to the same storage. I suspect you mean multiple processes on different machines writing to the same storage, in which case you’ll need to configure a networked container storage interface just like you would have to set up a networked file system in the sans-Kubernetes case.
It’s not that simple! You have to bring your own authn/authz layer, which is significantly more complicated than a reverse proxy.
This confuses me–you need an authn/authz layer regardless of whether you’re using Kubernetes, right?
Kubernetes doesn’t solve this problem! There are so many, many ways to do everything, your k8s cluster will be a special snowflake.
Kubernetes is extensible, and to the extent that you extend it, it will be a special snowflake, but the default stuff is bog standard. For example, replicasets, deployments, jobs, cronjobs, services, secrets, configmaps, etc are all standard with kubernetes, and they’re all things you would have to build yourself for most real world applications. Your cluster won’t be exactly like another’s, but it will be a lot more similar than any two VM-based systems.
I agree. The API and tools for k8s users are quite nice and comprehensive. I think that what people are criticizing is that k8s is quite difficult to deploy and maintain, and it doesn’t scale down well to single-host systems, which can be useful for local dev, learning, and small deployments.
I think that what people are criticizing is that k8s is quite difficult to deploy and maintain
Maybe I’m just doing it wrong, but it often feels like managing native hosts is a lot more painful, at least if you want to do it “right” rather than just SSH-ing on and making manual changes. For my homelab Raspberry Pi cluster, I have to do some minimal host stuff to get my nodes to run Kubernetes, and from there I just set a few secrets and apply all of the manifests and things kind of just work. Recently I rebuilt my cluster from scratch and it was pretty easy.
it doesn’t scale down well to single-host systems, which can be useful for local dev, learning, and small deployments.
Can you elaborate on this? I do most of my local dev on Docker’s Kubernetes integration. I haven’t noticed any problems.
More generally, RAII is a feature that exists in tension with the approach of operating on items in batches, which is an essential technique when writing performance-oriented software.
This is a falsehood some people in the intersection of the data-oriented design and C crowd love to sell. RAII works fine with batches, it’s just that the RAII object is the batch instead of the elements inside. Even if the individual elements have destructors, if you have an alternative implementation for batches C++ has all the tools to avoid the automatic destructor calls, just placement new into a char buffer and then you can run whatever batch logic instead. Don’t try bringing this up with any of them though or you’ll get an instant block.
I do highly performance sensitive latency work with tighter deadlines than gamedev and still use destructors for all the long lived object cleanup. For objects that are churned aggressively avoiding destructor calls isn’t any different than avoiding any other method call.
Agreed, this post is making some wild claims that don’t hold up in my experience. I’m writing a high-performance compiler in Rust, and most state exists as plain-old datatypes in re-usable memory arenas that are freed at the end of execution. RAII is not involved in the hot phase of the compiler. Neither are any smart pointers or linked lists.
I simply find the argument unconvincing. Visual Studio has performance problems related to destructors => RAII causes slow software?
“Exists in tension” seems accurate to me. Yes, you can do batches with RAII, but in practice RAII languages lead to ecosystems and conventions that make it difficult. The majority of Rust crates use standard library containers and provide no fine grained control over their allocation. You could imagine a Rust where allocators were always passed around, but RAII would still constrain things because batching to change deallocation patterns would require changing types. I think the flexibility (and pitfalls) of Zig’s comptime duck typing vs. Rust traits is sort of analogous to the situation with no RAII vs. RAII.
I think it’s the case that library interfaces tend not to hand control of allocations to the caller but I think that’s because there’s almost never pressure to do so. When I’ve wanted this I’ve just forked or submitted patches to allow me to do so and it’s been pretty trivial.
Similarly, most libraries that use a HashMap do not expose a way to pick the hash algorithm. This is a bummer because I expect the use of siphash to cause way more performance problems than deallocations. And so I just submit PRs.
Yes. I write Zig every day, and yet it feels like a big miss, and, idk, populist? “But don’t just take my word for it.” Feels like too much trying to do ‘convincing’ as opposed to elucidating something neat. (But I guess this is kind of the entire sphere it’s written in; what does the “Rust/Linux Drama” need? Clearly, another contender!)
It doesn’t, but without it I don’t really see the post offering anything other than contention for the sake of marketing.
I spend somewhere between 2 to 8 hours a day working on my own projects. (“2” on days I also do paid work, but that’s only two days a week.) Zig has been my language of choice for four or five years now; you can see a list on my GitHub profile. A lot of my recent work with it is private.
Thank you! I really like it, and I’m a little sad that Rust — which I still use often, maintain FOSS software in, and advocate for happily! — has narrowed the conversation around lower-level general-purpose programming languages in a direction where many now reject out of hand anything without language-enforced memory safety. It’s a really nice thing to have, and Rust is often a great choice, but I don’t love how dogmatic the discourse can be at the expense of new ideas and ways of thinking.
I very much agree. A Zig program written in a data-oriented programming style, where most objects are referenced using indices into large arrays (potentially associated with a generation number), should be mostly memory safe. But I haven’t written enough Zig to confirm this intuition.
I don’t remember the arguments against RAII much (has been a few years since) but that Zig doesn’t have RAII feels like an odd omission given the rest of the language design. It’s somewhat puzzling to me.
Hm, it’s pretty clear RAII goes against the design of Zig. It could be argued that it’d be a good tradeoff still, but it definitely goes against the grain.
Zig requires a keyword for control flow. RAII would be a single instance where control jumps to a user-defined function without this being spelled out explicitly.
Zig doesn’t have operator overloading, and, more generally, it doesn’t have any sort of customization points for type behavior. “Compiler automatically calls __deinit__ function if available” would be the sole place where that sort of thing would be happening.
Idiomatic Zig doesn’t use a global allocator, nor does it store per-collection allocators. Instead, allocators are passed down to the specific methods that need them as an argument. So most deinits in Zig take at least one argument, and that doesn’t work with RAII.
I was unaware that Zig discourages holding on to the allocator. I haven’t spent enough time with Zig, but for instance if you have an ArrayList you can defer .deinit() and it will work just fine. So I was assuming that this pattern:
var list = ArrayList(i32).init(heap_allocator);
defer list.deinit();
Could be turned into something more implicit like
var list = @scoped(ArrayList(i32).init(heap_allocator));
I understand that “hidden control flow” is something that Zig advertises itself against, but at the end of the day defer is already something that makes this slightly harder to understand. I do understand that this is something the language opted against, but it still feels odd to me that no real attempt was made (seemingly?) to avoid defer.
But it very much sounds like this pattern is on the way out anyway.
Zig’s std.HashMap family stores a per-collection allocator inside the struct that is passed in exactly once through the init method. Idk how that can be considered non-idiomatic if it’s part of the standard library.
Zig is a pre-1.0 language. Code in the stdlib is not necessarily idiomatic, both because there’s still idiom churn and because it was not uniformly audited for code quality.
As someone who doesn’t use Zig or follow it closely, both the fact that that change is being made and the reason behind it are really interesting. Thanks for sharing it here
Even if the individual elements have destructors, if you have an alternative implementation for batches C++ has all the tools to avoid the automatic destructor calls, just placement new into a char buffer and then you can run whatever batch logic instead.
I’ve never used placement new, so I don’t know about that. My question is: how do you do that? Take for instance a simple case where I need a destructor:
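// (Sketch of the sort of thing I mean: an Element that owns its parts
// through unique_ptrs, so its destructor has real work to do.)
class Element
{
    std::unique_ptr<Foo> foo;
    std::unique_ptr<Bar> bar;
    std::unique_ptr<Baz> baz;
};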
If I have a bunch of elements that are all constructed at the same time, then later all destroyed at the same time, I can imagine having a dedicated Element_list class for this. But never having used placement new, I don’t know right now how I would batch the allocations and deallocations.
And what if my elements are constructed at different times, but then later destroyed at the same time? How could we make that work?
Don’t try bringing this up with any of them though or you’ll get an instant block.
I think I have an idea about their perspective. I’ve never done Rust, but I do have about 15 years of C++ experience. Not once in my career have I seen a placement new. Not in my own code, not in my colleagues’ code, not in any code I have ever looked at. I know it’s a thing when someone mentions it, but that’s about it. As far as I am concerned it’s just one of the many obscure corners of C++. Now imagine you go to someone like me, and tell them to “just placement new” like it’s a beginner technique everyone ought to have learned in their first year of C++.
I don’t expect this to go down very well, especially if you start calling out skill issues explicitly.
I’ve never done Rust, but I do have about 15 years of C++ experience. Not once in my career have I seen a placement new. Not in my own code, not in my colleagues’ code, not in any code I have ever looked at.
I’m a little bit surprised, because I’ve had the opposite experience. Systems programming in C++ uses placement new all of the time, because it’s the way that you integrate with custom allocators.
In C++, there are four steps to creating and destroying an object:
Allocate some memory for it.
Construct the object.
Destruct the object.
Deallocate the memory.
When you use the default new and delete operators, each of them does two of these at once: new first calls the global operator new, which returns a pointer to some memory (or throws an exception if allocation fails), and then calls the constructor; delete calls the destructor and then the global operator delete to release the memory. Both new and delete are simply operators that can be overloaded, so you can provide your own, either globally, globally for some overload, or per class.
Placement new has weird syntax, but is conceptually simple. When you do new SomeClass(...), you’re actually writing new ({arguments to new}) SomeClass({arguments to SomeClass's constructor}). You can overload new based on the types of the arguments passed to it. Placement new is a special variant that takes a void* and doesn’t do anything (it’s the identity function). When you do new (somePointer) SomeClass(Args...), where somePointer is an existing allocation, the placement new simply returns somePointer. It’s up to you to ensure that you have space here.
If you want to allocate memory with malloc in C++ and construct an object in it, you’d write something like this (not exactly like this, because this will leak memory if the constructor throws):
#include <cstdlib>  // malloc
#include <new>      // placement new
#include <utility>  // std::forward

template<typename T, typename... Args>
T *create(Args... args)
{
    void *memory = malloc(sizeof(T));
    return new (memory) T(std::forward<Args>(args)...);
}
This separates the allocation and construction: you’re calling malloc to allocate the object and then calling placement new to call the constructor and change the type of the underlying memory to T.
Similarly, you can separate the destruction and deallocation like this (same exception-safety warning applies):
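template<typename T>
void destroy(T *object)
{
    object->~T();    // run the destructor explicitly
    free(object);    // then give the memory back
}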
In your example, std::unique_ptr has a destructor that calls delete. This may be the global delete, or it may be some delete provided by Foo, Bar, or Baz.
If you’re doing placement new, you can still use std::unique_ptr, but you must pass a custom deleter. This can call the destructor but not reclaim the memory. For example, you could allocate space for all three of the objects in your ‘object’ with a single allocation and use a custom deleter that didn’t free the memory in std::unique_ptr.
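A minimal sketch of that idea (Foo and the single malloc stand in for whatever your real allocation scheme is):

#include <cstdlib>
#include <memory>
#include <new>

struct Foo { int x = 0; };

// Deleter that destroys but does not deallocate.
struct DestroyOnly {
    void operator()(Foo *p) const { p->~Foo(); }
};

int main() {
    void *buffer = std::malloc(sizeof(Foo));
    std::unique_ptr<Foo, DestroyOnly> foo(new (buffer) Foo());
    // ... use *foo ...
    foo.reset();        // runs ~Foo(); the memory itself is still allocated
    std::free(buffer);  // reclaim the storage separately, all at once
}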
Most of the standard collection types take an allocator as a template argument, which makes it possible to abstract over these things, in theory (in practice, the allocator APIs are not well designed).
LLVM does arena allocation by making some classes’ constructors private and exposing them with factory methods on the object that owns the memory. This does bump allocation and then placement new. You just ‘leak’ the objects created this way; they’re collected when the parent object is destroyed.
I’ve done very little systems programming in C++. Almost all the C++ code I have worked with was application code, and even the “system” portion hardly made any system calls. Also, most C++ programs I’ve worked on would have been better off using a garbage-collected language, but that wasn’t my choice.
This may explain the differences in our experiences.
Yup, that’s a very different experience. Most C++ application code I’ve seen would be better in Java, Objective-C, C#, or one of a dozen other languages. It’s a good systems language, it’s a mediocre application language.
For use in a kernel, or writing a memory allocator, GC, or language runtime, C++ is pretty nice. It’s far better than C and I think the tradeoff relative to Rust is complicated. For writing applications, it’s just about usable but very rarely the best choice. Most of the time I use C++ in userspace, I use it because Sol3 lets me easily expose things to Lua.
I think it very much also depends on the subset of C++ you’re working with. At a former job I worked on a server application that might have worked in Java with some pain (it interfaced with C libs quite a bit), and in (2020?) or later it should probably have been done in Rust, but it was started just slightly before Rust had gained… traction, or its 1.0 release. It was (or still is, probably) written in the most high-level, Java-like C++ I’ve ever seen, due to extensive use of Qt and smart pointers. I’m not saying we never had segfaults or memory problems, but not nearly as many as I would have expected.
But yeah, I think I’ve never even heard about this placement new thing (reading up now), but I’m also not calling myself a C++ programmer.
Placement new is half the story, you also need to be aware that you can invoke destructors explicitly.
A trivial example looks like
alignas(foo) char foo_storage[sizeof(foo)]; // raw storage, suitably aligned for a foo
foo *obj = new (&foo_storage[0]) foo();
obj->do_stuff();
obj->~foo(); //explicitly invoke the destructor
If you want to defer the construction of multiple foos but have a single allocation you can imagine char foos_storage[sizeof(foo)*10] and looping to call the destructors. Of course you can heap allocate the storage too.
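Something like this, as a sketch (foo here is just a stand-in type):

#include <cstddef>
#include <new>

struct foo {
    ~foo() { /* per-element cleanup */ }
};

int main() {
    // Raw storage for up to 10 foos: one "allocation" (here on the stack).
    alignas(foo) char foos_storage[sizeof(foo) * 10];
    std::size_t constructed = 0;

    // Construct elements at different times into the same buffer.
    new (foos_storage + sizeof(foo) * constructed) foo();
    ++constructed;
    new (foos_storage + sizeof(foo) * constructed) foo();
    ++constructed;

    // Destroy them all at the same time; the storage itself is released
    // in one go when the array leaves scope.
    for (std::size_t i = 0; i < constructed; ++i)
        reinterpret_cast<foo *>(foos_storage + sizeof(foo) * i)->~foo();
}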
However, you mostly don’t do this, because if you’re looking for something that keeps a list of elements and uses placement new to batch allocation/deallocation, that’s just std::vector<element>.
Likewise, if I wanted to batch the allocation of Foo, Bar, and Baz in Element, I probably would just make them normal members.
class Element
{
    Foo foo;
    Bar bar;
    Baz baz;
};
Each element and its members are now a single allocation, and you can stick a bunch of them in a vector for more batching.
If you want to defer the initialization of the members but not the allocation you can use std::optional to not need to deal with the nitty gritty of placement new and explicitly calling the destructor.
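A tiny sketch of the std::optional version (Bar is a stand-in):

#include <optional>

struct Bar { int x; };

struct Element {
    std::optional<Bar> bar;  // storage lives inline in Element; no Bar constructed yet
};

int main() {
    Element e;               // "allocation" happens here, as part of Element
    e.bar.emplace(Bar{42});  // construction deferred until now
    e.bar.reset();           // destruction on demand; the storage stays put
}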
IME placement new comes up implementing containers and basically not much otherwise.
Note that since C++20 you should rather use std::construct_at and std::destroy_at, since these don’t require spelling out the type and can be used inside constexpr contexts.
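For reference, the same construct/destroy pair with those (Foo is a stand-in type):

#include <cstdlib>
#include <memory>  // std::construct_at / std::destroy_at (C++20)

struct Foo { int x; };

int main() {
    void *mem = std::malloc(sizeof(Foo));
    Foo *f = std::construct_at(static_cast<Foo *>(mem), Foo{1});  // instead of placement new
    std::destroy_at(f);                                           // instead of f->~Foo()
    std::free(mem);
}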
You likely use placement new every day indirectly without realizing it, it’s used by std::vector and other container implementations.
When you write new T(arg) two things happen, the memory is allocated and the constructor runs. All placement new does is let you skip the memory allocation and instead run the constructor on memory you provide. The syntax is a little weird new(pointer) T(arg). But that’s it! That will create a T at the address stored in pointer, and it will return a T* pointing to the same address (but it will be a T* whereas pointer was probably void* or char*). Without this technique, you can’t implement std::vector, because you need to be able to allocate room for an array of T without constructing the T right away since there’s a difference between size and capacity. Later to destroy the item you do the reverse, you call the destructor manually foo->~T(), then deallocate the memory. When you clear a vector it runs the destructors one by one but then gives the memory back all at once with a single free/delete. If you had a type that you wanted to be able to do a sort of batch destruction on (maybe the destructor does some work that you can SIMD’ify), you’d need to make your own function and call it with the array instead of the individual destructors, then free the memory as normal.
I’m not trying to call anybody out for having a skill issue, but I am calling out people who are saying it’s necessary to abandon the language to deal with one pattern without actually knowing what facilities the language provides.
There are different ways you could do it, but one way would be to have a template that you specialize for arrays of T, where the default implementation does one-by-one destruction and the specialization does the batch version. You could also declare the regular operator delete without providing an implementation, to force people to remember to use the special function.
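A rough sketch of that shape (Particle is a made-up type whose elements need no per-item work):

#include <cstddef>

// Default: destroy a range one element at a time.
template <typename T>
struct BatchDestroy {
    static void run(T *items, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            items[i].~T();
    }
};

struct Particle { float x, y, z; };

// Specialization: nothing per-element has to run for Particle, so destroying
// a whole array of them is a no-op; just free the memory as usual afterwards.
template <>
struct BatchDestroy<Particle> {
    static void run(Particle *, std::size_t) {}
};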
To be honest, any language with manual memory management would’ve worked out just fine.
Between C++, Rust, and Zig, I find that Zig has the most ergonomic way of swapping allocation schemes at will.
That, and the comptime bit I mentioned in the post lets me easily write AST queries (which is something I expect plugin authors to be doing a lot of).
I’m a bit surprised the article doesn’t mention the main issue with using threads for a high number of concurrent tasks: the memory used by each thread for its call stack.
Memory usage is not the main issue with threads. Memory usage of Go’s goroutines and threads is not that different. I think it’s like a 4x-8x difference? Which is not small, of course, but given that memory for threads is only a fraction of the memory the app uses, it’s actually not that large in absolute terms. You need a comparable amount of memory for buffers for TCP sockets and such.
As far as I can tell, the actual practical limiting factor for threads is that modern OSes, in default configuration, just don’t allow you to have many threads. I can get to a million threads on my Linux box if I sudo tweak it (I think? Don’t remember if I got to a million actually).
But if I don’t do OS-level tweaking, I’ll get errors around 10k threads.
Not touching on the rest of the comment because it’s not something I have extensive experience with, but I do want to point out that Go isn’t really the best comparison point since goroutines are green threads with growable stacks, so their memory usage is going to be lower than native threads. Any discussion about memory usage of threads probably also needs to account for overcommit of virtual vs resident memory.
All of this is moot in Rust due to futures being stackless, so my understanding (I reserve the right to be incorrect) is that in theory they should always use less memory than a stackful application.
That is precisely my point: memory efficiency is an argument for stackless coroutines over stackful coroutines, but it is not an argument for async IO (in whichever form) over threads.
Sorry for reading and replying to your comment so late. The minimum stack size of a goroutine is 2 kB. The default stack size of a thread on Linux is often 8 MB. Of course, the stack size of most goroutines will be higher. And similarly, it is usually possible to reduce the stack size of a thread to 1 MB or less if we can guarantee the program will never need more. Is that how you concluded that the difference was somewhere around 4x-8x?
I like your point about the fact that the memory for buffers, usually allocated on the heap, should be the same regardless. Never thought of that :)
I know, but I’ve always wondered what happens when a program has hundreds of thousands of threads. Will the TLB become too large, with a lot of TLB misses making the program slow? When the stack of a thread grows from 4 kB to 8 kB, how is that mapped to physical memory? Does it mean there will be 2 entries in the TLB, one mapping the first 4 kB, and another the second 4 kB? Or will the system allocate a contiguous segment of 8 kB, and copy the first 4 kB to the new memory segment? I have no idea how it works concretely. But I would expect these implementation “details” to impact performance when the number of threads is very large.
I did some reading and will try to answer my own questions :)
Q: Will the TLB become too large with a lot of TLB miss making the program slow?
A: The TLB is a cache and has a fixed size. So no, the TLB can’t become “too large”. But if the working set of pages becomes too large for the TLB, then yes there will be cache misses, causing TLB thrashing, and making the program slow.
Q: When the stack of a thread grows from 4 kB to 8 kB, how is that mapped to physical memory?
A: The virtual pages are mapped to physical pages on demand, page by page.
Q: Does it mean there will be 2 entries in the TLB, one mapping the first 4 kB, and another the second 4 kB?
A: Yes. At least this is the default on Linux, as far as I understand.
Q: Or will the system allocate a contiguous segment of 8 kB, and copy the first 4 kB to the new memory segment?
A: No.
Q: I would expect these implementation “details” to impact performance when the number of threads is very large.
A: If the stacks are small (a few kB), then memory mapping and TLB thrashing should not be a problem.
It’s 8 megs of virtual memory. Physically, only a couple of pages will be mapped. A program that spawns a million threads will use dozens of megs not 8 gigs of RAM.
Correct, I keep forgetting about this. But assuming that each thread maps at least a 4 kB page, and that the program spawns a million threads, then it should use 1 million x 4 kB = 4 GB, and not dozens of megs? Or am I missing something?
The stack size isn’t a problem at all. Threads use virtual memory for their stacks, meaning that if the stack size is e.g. 8 MiB, that amount isn’t committed until it’s actually needed. In other words, a thread that only peaks at 1 MiB of stack space will only need 1 MiB of physical memory.
Virtual address space in turn is plentiful. I don’t fully remember what the exact limit is on 64 bits Linux, but I believe it was somewhere around 120-something TiB. Assuming the default stack size of 8 MiB of virtual memory and a limit of 100 TiB, the maximum number of threads you can have is 13 107 200.
The default size is usually also way too much for what most programs need, and I suspect most will be fine with a more restricted size such as 1 MiB, at which point you can now have 104 857 600 threads.
Of course, if the amount of committed stack space suddenly spikes to e.g. 2 MiB, your thread will continue to hold on to it until it’s done. This however is also true for any sort of userspace/green threading, unless you use segmented stacks (which introduce their own challenges and problems). In other words, if you need 2 MiB of stack space then it doesn’t matter how clever you are with allocating it, you’re going to need 2 MiB of stack space.
The actual problems you’ll run into when using OS threads are:
An increase in context switching costs, which may hinder throughput (though this is notoriously difficult to measure)
Having to tune various sysctl settings (e.g. most Linux setups will have a default limit of around 32 000 threads per process, requiring a sysctl change to increase that). Some more details here
Different platforms behaving widely differently when having many OS threads. For example, macOS had (not sure if this is still the case) a limit of somewhere around 2000 OS threads per OS process
The time to spawn threads isn’t constant and tends to degrade when the number of OS threads increases. I’ve seen it go up all the way to 500 milliseconds in stress tests
Probably more that I can’t remember right now
Of these, context switch costs are the worst because there’s nothing you as a user/developer can do about this, short of spawning fewer OS threads. There also doesn’t appear to be much interest in improving this (at least in the Linux world that I know of), so I doubt it will (unfortunately) improve any time soon.
In these runs, I’m seeing 18.19s / 26.91s ≅ 0.68 or a 30% speedup from going async. However, if I pin the threaded version to a single core, the speed advantage of async disappears:
So, currently I think I don’t actually know the relative costs here, and I choose not to believe anyone who claims that they know, if they can’t explain this result.
EDIT: to clarify, it very well might be that the benchmark is busted in some obvious way! But it really concerns me that I personally don’t have a mental model which fits the data here!
From what I understand, there are two factors at play (I could be wrong about both, so keep that in mind):
The time of a context switch is somewhere in the range of 1 to 2 microseconds
With more threads running, the number of context switches may increase
The number of context switches is something you might not be able to do much about, even with thread pinning. If you have N threads (where N is a large number) and you want to give them a fair time slice, you’re going to need a certain number of context switches to achieve that.
This means that we’re left with reducing the context switch time. When doing any sort of userspace threading, the context switch time is usually in the order of a few hundred nanoseconds at most. For example, Inko can perform a context switch in somewhere between 500 and 800 nanoseconds, and its runtime isn’t even that well optimized.
To put it differently, it’s not that context switching is slow, it’s that it isn’t fast enough for programs that want to use many threads.
Your two comments here are some of the best things I’ve read about the topic in a while! Consider writing a blog post about this whole thing! In particular,
With more threads running, the number of context switches may increase
Is not something I’ve heard before, and it makes some sense to me (though I guess I still need to think about it more — with many threads, most threads should be idle (waiting for IO, not runnable)).
I’d been waiting for someone who knew more about the kernel guts to comment, but I guess that’s not going to happen, so here goes.
The context switching cost shouldn’t depend on the number of threads that exist, although there were one or two Linux versions in the early 2000s with a particularly bad scheduler where it did. I don’t buy that the number of context switches (per unit time) increases with the number of threads either in most cases; in a strongly IO-bound program it will depend solely on the number of blocking calls, and when CPU-bound it will be limited by the minimum scheduling interval*.
I am not convinced about the method in the repo you linked. Blocking IO and async are almost the same thing if you force them to run sequentially. Whether this measures context switch overhead fairly is beyond my ken, but I will say that a reactor that only ever dispatches one event per loop is artificially crippled. It’s doing all its IO twice.
Contrary to what one of the GH issues says, though, it’s probably not doing context switches. Like pretty much any syscall epoll_wait isn’t a context switch unless it has to actually wait.
This isn’t a degenerate case for blocking IO and that’s enough to make up for a bit of context switching. I think that’s all there is to it.
In general, though, the absolute cost of blocking IO is lower than I think almost everyone assumes. Threads that aren’t doing anything only cost memory (to the tune of a few KiB of stack) and context switches are usually a drop in the ocean compared to whatever work your program is actually doing. I think a better reason to avoid lots of threads is the loss of control over latency distribution. Although terrible tail latency with lots more threads than cores is often observed, I don’t know that I’ve ever read a particularly convincing explanation for this.
* Although that is probably too low (i.e. frequent) by default.
RAM is a lot cheaper nowadays than it was when c10k was a scaling challenge; 10,000 connections * 1 MiB stack is only ~10 GiB. Even if you want to run a million threads on consumer-grade server hardware (~ 1 TiB), the kinds of processes that have that kind of traffic (load balancers, HTTP static assets, etc) can usually run happily with stack sizes as small as 32 KiB.
RAM is a lot cheaper nowadays than it was when c10k was a scaling challenge
That just means that the bar should be higher now: c10M or maybe c100M.
As for running OS threads with a small stack size, why should we have to tune that number when, with async/await, the compiler can produce a perfectly sized state machine for the task?
That just means that the bar should be higher now: c10M or maybe c100M.
If your use case requires handling ten million connections per process, then you should use the high-performance userspace TCP/IP stack written by the hundred skilled network engineers in your engineering division.
Don’t try to write one library that solves problems at every level of scale. Use a simple library to solve simple problems (10,000 connections on consumer hardware), and a complex library to solve complex problems (millions of connections on a 512-core in-house 8U load balancer).
As for running OS threads with a small stack size, why should we have to tune that number when, with async/await, the compiler can produce a perfectly sized state machine for the task?
Because you’ll need to tune the numbers anyway, and setting the thread stack size is a trivial tuning that lets you avoid the inherent complexity of N:M userspace thread scheduling libraries.
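For what it’s worth, that tuning is only a couple of lines with raw pthreads (the 64 KiB figure here is just an example):

#include <cstdio>
#include <pthread.h>

void *worker(void *) { return nullptr; }

int main() {
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    // Request a 64 KiB stack instead of the default (often 8 MiB of virtual
    // address space); the value must stay above PTHREAD_STACK_MIN.
    pthread_attr_setstacksize(&attr, 64 * 1024);

    pthread_t tid;
    if (pthread_create(&tid, &attr, worker, nullptr) != 0)
        std::perror("pthread_create");
    else
        pthread_join(tid, nullptr);
    pthread_attr_destroy(&attr);
}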
This is exciting, and I’m looking forward to trying it in place of iTerm.
What are the security goals and architecture? Lots of features means lots of attack surface, exposed to remote servers and files accidentally cat’d to the terminal. For spatial memory safety, does it/can it use Zig’s release-safe mode? For temporal memory safety, what allocation strategy is used? Are the internals friendly to targeted fuzzing? Are there dangerous features exposing command execution or the filesystem? Is bracketed paste supported adversarially? /cc @mitchellh
To be clear, this is not a list of gotchas, but of potential wins over other emulators that would make me particularly excited about switching.
Great questions. Ghostty 1.0 will be just as vulnerable to terminal escape security issues as basically any other terminal (as described in recent blackhat talks and so on).
We have some rudimentary protections common to other terminals (but also notably missing from many) such as an “unsafe paste warning” where we detect multi line pastes or pastes that attempt to disable bracketed paste and so on. It’s not sufficient to call a terminal secure by any means but does exist.
On Linux we have the ability to put shell processes into cgroups and currently do for memory protections but stop there. When I implemented that feature I noted I’d love to see that expanded in the future.
I think the real answer though is that this is one of my top goals for future improvements to the terminal sequence level as I noted in the future section. I’m working on a design now but I want to enable shells to drop privileges while children are running much in the same way OpenBSD has a syscall to drop privileges. For example, a shell or program should be able to say “ignore all escape sequences” (maybe except some subset) so things like tailing or catting logs are always safe.
The security framework is probably my first major goal for innovation in the future. For Ghostty 1.0, I’ve more or less inherited all the same architectural problems given by the underlying specifications of old (de facto and de jure).
If you’d like to be a part of this please email me, I have a ton of respect for your work and consider you far more of a security expert than me!
I realized I didn’t answer some of your other questions in my other response. Let me follow up with that now:
For memory safety, we currently recommend ReleaseFast. The speed impact of the safety checks on the safe build is too great and noticeably makes some things not smooth. We need to do a better job of strategically disabling safety checks on certain hot paths to make safe builds usable, but that is something I want to do.
Our debug builds (not release safe) have extra memory integrity checks that make builds VERY slow but also VERY safe. I’d like to set up build machines that run fuzz testing on these builds 24/7 using latest main. This is not currently in place (but the integrity checks are in place).
Re: fuzzing: the internals are very friendly to fuzzing. We’ve had community members do fuzzing (usually with afl) periodically and we’ve addressed all known fuzz crashes at those times. It has been a few months since then so I’m not confident to say we’re safe right now. :)
Longer term, I’m interested in making an entire interaction sequence with Ghostty configurable in text so we can use some form of DST and fuzzing to continuously hammer Ghostty. That’s a very important goal for me.
For memory safety, we currently recommend ReleaseFast. The speed impact of the safety checks on the safe build is too great and noticeably makes some things not smooth.
I’m not super familiar with the intricacies of Zig, but does it have any way to toggle individual safety checks? IME there are things like bounds checks that are basically free and massively help with safety, and then you have things like null checks that aren’t as useful for safety but are still basically free, and then you have things like checked addition that are comparatively expensive but don’t really give as much on their own. That might be something to check out?
Can someone who is knowledgeable in both Zig and Rust speak to which would be “better” (not even sure how to define that for this case) to learn for someone who knows Bash, Python and Go, but isn’t a software developer by trade? I’m an infrastructure engineer, but I do enjoy writing software (mostly developer tooling) and I’m looking for a new language to dip my toes into.
I second this and will also add that Zig’s use of “explicit” memory allocation (i.e. requiring an Allocator object anytime you want to allocate memory) will train you to think about memory allocation patterns in ways no other language will. Not everyone wants to think about this of course (there’s a reason most languages hide this from the user), but it’s a useful skill for writing high performance software.
I think the “both” answer is kinda right, which annoys me a little, because it is a lot to learn. But I can accept that we’ll have more and more languages in the future – more heterogeneity, with little convergence, because computing itself is getting bigger and more diverse
e.g. I also think Mojo adds significant new things – not just the domain of ML, but also the different hardware devices, and some different philosophies around borrow checking, and close integration with Python
And that means there will be combinatorial amounts of glue. Some of it will be shell, some will be extern "C" kind of stuff … Hopefully not combinatorial numbers of build systems and package managers, but probably :-)
Whichever the case, you need to learn to appraise software yourself, otherwise you will have to depend on marketing pitches forever.
Try both, I usually recommend to give Rust a week and Zig a weekend (or any length of time you deem appropriate with a similar ratio), and make up your own mind.
If you’re new to low-level programming in general then Rust will almost certainly be easier for you – not easy, but easier.
Zig is a language designed by people who love the feeling of writing in C, but want better tooling and the benefit of 50 years of language design knowledge. If Rust is an attempt at “C++ done right”, Zig is maybe the closest there is right now to “C done right”. The flip side to that is part of the C idiom they cherish is being terse to the point of obscurity, and having relatively fewer places where the compiler will tell you you’re doing something wrong.
IMO the best ordering is Rust to learn the basics, C to learn the classics, and then Zig when you’ve written enough C to get physically angry at the existence of GNU Autotools.
I would also recommend “Learn Rust the Dangerous Way” once you know C (even if you already know Rust by then), to learn how to go from “C-like” code to idiomatic Rust code without losing any performance (in fact, gaining). It’s quite enlightening to see how you can literally write C code in Rust, then slowly improve it.
The quote doesn’t say that he intends it to replace C++, just that he wants to use it for problems he previously used C++ for
That is a very important distinction, because I’m very sure there are lots of C++ programmers who like programming with more abstraction and syntax than Zig will provide. They’ll prefer something closer to Rust
I’m more on the side of less abstraction for most things, i.e. “plain code”, Rust being fairly elaborate, but people’s preferences are diverse.
BTW Rob Pike and team “designed Go to replace C++” as well. They were writing C++ at Google when they started working on Go, famously because the compile times were too long for him.
That didn’t end up happening – experienced C++ programmers often don’t like Go, because it makes a lot of decisions for them, whereas C++ gives you all the knobs.
I was asked a few weeks ago, “What was the biggest surprise you encountered rolling out Go?” I knew the answer instantly: Although we expected C++ programmers to see Go as an alternative, instead most Go programmers come from languages like Python and Ruby. Very few come from C++.
Some people understand “replacement” to mean, “it can fill in the same niche”, while others mean, “it works with my existing legacy code”.
I always interpreted it to mean the former, so to me Zig is indeed a C++ replacement. As in, throw C++ in the garbage can, stop using it forever, and use Zig instead. Replace RAII with batch operations.
To the world: Your existing C++ code is not worth saving. Your C code might be OK.
As a university student, I’d prefer Zig. Zig is easier to learn (it depends), and for me, I can understand some things more deeply when writing Zig code and using Zig libraries. Rust has a higher level of abstraction, which prevents you from touching some deeper concepts to some extent. Zig’s principle is to let the user have direct control over the code they write. Currently Zig’s documentation isn’t detailed, but the code in the std library is very straightforward; you can read it without the zls language server enabled, or with a text editor that only has syntax highlighting, and still have a comfortable code-reading experience.
I am not an expert in Zig, but there was a thread about Rust and Zig by the person maintaining the Linux kernel driver (written in Rust) for the new Apple hardware, here:
More specifically, if you’re coming from Python and Go in particular, I think you will enjoy Rust’s RAII and lifetime semantics more. Those are roughly equivalent to Python’s reference counting at compile time (or at runtime if you need to use Rc/Arc). It all ends up being a flavor of automatic memory management, which is broadly comparable to Go’s GC too. And Rust gives you the best of both worlds: 100% safe code by default (like Python, in fact, even stronger since Python lets you write “high-level, memory safe” data races without thinking but Rust makes it more explicit) and equal or higher performance than Go, with fast threading.
Zig sounds more aimed towards folks that come from C, and don’t want to jump into the “let the compiler take care of things for me” world. That said, I’m not experienced with Zig by any means, so you might want to hear from someone who is.
Regarding the original post, what if de-initialization can fail? I always found RAII to be relatively limited for reasons like that
It shouldn’t always be silent/invisible.
And I feel like if RAII actually works, then your resource problem was in some sense “easy”.
I’m not sure if RAII still works with async in Rust, but it doesn’t with C++. Once you write (manual) async code you are kind of on your own. You’re back to managing resources manually.
I googled and found some links that suggest there are some issues in that area with Rust:
And I feel like if RAII actually works, then your resource problem was in some sense “easy”.
Then why do people fuck up so much?
I’m not sure if RAII still works with async in Rust, but it doesn’t with C++. Once you write (manual) async code you are kind of on your own. You’re back to managing resources manually.
If the resource doesn’t need any asynchronous operations to be freed, works great. Which is to say, 99% of resources will still be handled by RAII.
I read through it, and as someone who has used both, that whole thread is not arguing well for Zig, only for Rust. It has a lot of trolls in it that are probably just after Lina (I know there are multiple). Most of us who prefer Zig to Rust are not deranged loonies like many in that conversation.
This post on why someone rewrote their Rust keyboard firmware in Zig might help you understand some of the differences between the two languages: https://kevinlynagh.com/rust-zig/
You’ll probably get along easier with Rust, but Zig might just bend your mind a little more. You need a bit more tolerance of bullshit with Zig since there’s less tooling, less existing code, and you might get stuck in ways that are new, so your progress will likely be slower. (I have one moderately popular library in Rust, but spend all my “free” time doing Zig, which I think demonstrates the difference nicely!)
I guess I think of what’s involved with learning to write Rust as more of an exercise (learn the rules of the borrow checker to effectively write programs that pass it), whereas imo with Zig there’s some real novelty in expressing things with comptime. It of course depends on your baseline; maybe sum types are new enough to you already.
One of the things I dislike about Rust’s documentation and educational material the most is that it’s structured around learning the rules of the borrow checker to write programs that pass it (effectively or not :-) ), instead of learning the rules of the borrow checker to write programs that leverage it – as you put it, effectively writing programs that pass it.
The “hands-on” approach of a lot of available materials is based on fighting the compiler until you come up with something that works, instead of showing how to build a model structured around borrow-checking from the very beginning. It really pissed me off when I was learning Rust. It’s very difficult to follow, like teaching dynamic memory allocation in C by starting with nothing but null pointers and gradually mallocing and freeing memory until the program stops segfaulting and leaking memory. And it’s really counterproductive: at the end of the day all you’ve learned is how to fix yet another weird cornercase, instead of gaining more fundamental insight into building models that don’t exhibit it.
I hope this will slowly go out of fashion as the Rust community grows beyond its die-hard fan base. I understand why a lot of material from the “current era” of Rust development is structured like this, because I saw it with Common Lisp, too. It’s hard to teach how to build borrow checker-aware models without devoting ample space to explaining its shortcomings, showing alternatives to idioms that the borrow checker just doesn’t deal well with, explaining workarounds for when there’s no way around them and so on. This is not the kind of thing I’d want to cover in a tutorial on my favourite language, either.
I don’t know Zig so I can’t weigh in on the parent question. But with the state of Rust documentation back when I learned it (2020/2021-ish) I am pretty sure there’s no way I could’ve learned how to write Rust programs without ample software development experience. Learning the syntax was pretty easy (prior exposure to functional programming helped :-) ) but learning how to structure my programs was almost completely a self-guided effort. The documentation didn’t cover it too much and asking the community for help was not the most pleasant experience, to put it lightly.
like teaching dynamic memory allocation in C by starting with nothing but null pointers and gradually mallocing and freeing memory until the program stops segfaulting and leaking memory.
That’s a good one! There is a thin line between fearless and thoughtless.
If you like Go, you might like Zig, since both are comparatively simple languages. You can keep all of either language in your head. This means lots of things are not done for you.
Rust is more like Python: both are complicated languages that do more things for you. It’s unlikely you can keep either one fully in your head, but you can keep enough in your head to be useful.
I think this is why many people compare Rust to C++ and Zig to C. C++ is also a complicated language; I’d say it’s one of the most complicated around. Rust is not as bad as C++ yet, since it hasn’t been around long enough to have loads of cruft. Perhaps, with the way Rust is structured around backwards compatibility, it will find a way to keep the complications reasonable. So far most Rust code-bases have enough in common that you can get along. In C++ you can find code-bases that are not similar enough that they even feel like the same language.
It should also be noted that Zig is a lot younger than Rust, so it’s not entirely clear how far down the complicated path Zig will end up, but I’d guess based on their path so far, they won’t go all in on complicated like Rust and C++.
Well, @matklad is already here, but for me, coming from Go and frustrated after trying Rust a couple of times, I was motivated to try Zig by @mitchellh talking with @kristoff about why he chose Zig for Ghostty (his terminal emulator project), and how it matches with my experience/profile…
… the reason I personally don’t like working too much in Rust (I have written Rust; I think as a technology it’s a great language and it has great merits) is that every project that I read with Rust ends up basically being “chase the trait implementation around”: what file is this trait defined in, what file is the implementation in, how many implementations are there… and I feel like I’m just chasing the traits and I don’t find that particularly… I don’t know, productive I should say. I like languages where you start reading on line one, you read to line ten, and that’s exactly what happens, and so I think Zig’s a good fit …
At this point I felt crazy for even considering Rust. I had accomplished more in 4 days than what had taken me 16 days in Rust. But more importantly, my abstractions were holding up.
Not speaking to the languages at all, but I’d say to choose the more mature language - Rust. Even after learning Rust, I still told people to just learn C++ if the goal was to learn that kind of language. That’s a trickier choice now (C++ vs Rust) because Rust has reached a tipping point in terms of resources, so it’s easier to recommend. Zig is just way too early and it’s still not a stable language, I wouldn’t spend the time on it unless you have a specific interest.
If the goal is just to reduce useless pull requests by pedants, I’d suggest using language like “C++, D, and Go use control flow mechanisms like throw/catch or panic/recover, which can prevent bar() from being called”. I believe it would get the message across without triggering a tedious semantic argument about what is the one true definition of “Exception”.
It’s easy to have conflicting opinions on whether or not panic/recover counts as an “exception”, and whether or not it is good practice to use it, but it’s a hard fact that it exists in the language and that it is control flow.
Agreed. It seems like the motivation of the Zig documentation is to show that Zig has no hidden control flow - which is great! Maybe it’s better to avoid comparisons to other languages in that respect. It’s not a zero-sum game, there’s no need for value judgements.
Almost full circle. They reinvented the “password manager”, but with extremely complicated specifications (FIDO2, WebAuthn), especially considering the purpose, a shitload of JS and a pile of legacy crypto on top 🤡
Except that password managers use symmetric cryptography (shared secrets). Passkeys use asymmetric cryptography (public-private key pair), which comes with a lot of security benefits.
I have never really been able to figure out what a passkey is so apologies if this is missing the point, but the password manager I use uses asymmetric cryptography. I haven’t really used 1password or lastpass or any of the popular password managers; do those work differently?
The underlying protocol is WebAuthn, and “passkeys” is basically a mass-market branded version of it which is getting a lot of attention.
The basic idea is that instead of having a password or other type of shared credential, for each site you create a public/private keypair and the site stores the public key as part of your account data. Then when you need to authenticate to the site, they present you with something to sign using the private key, and use the public key on file to check the signature.
The claimed advantages here are:
Does away with credential breaches. If you pop the site’s auth database, all you get are the public keys, which are not usable to sign in.
Phishing resistance: whatever device you’re using knows which keypair goes with which site, and you have to fool it rather than fool the human user. And a key failure mode of password managers – that you can just override the password manager’s refusal and manually copy/paste the password to log into a phish anyway – is a lot more difficult since you’d need to manually conduct the whole cryptographic dance unassisted. You can if you’re sufficiently motivated and knowledgeable, but that’s a pretty high barrier.
Simple end-user workflow: when it integrates into the browser (either as a direct feature or via plugin), it’s a pretty seamless experience, and generally all you do to sign in to a site with passkeys is engage your device’s biometrics (fingerprint reader or face recognition or whatever) and it Just Works™.
The last bit is really an understated killer feature. Non-password-based authentication systems have historically had terrible user experience, bordering on completely unusable for insufficiently-technical people. And public-key cryptography also has historically had terrible user experience (I recall a conference once where the statement “we will drink until PGP makes sense” was uttered, for example). Automating the whole process works wonders for improving the experience.
Does away with credential breaches. If you pop the site’s auth database, all you get are the public keys, which are not usable to sign in.
If everyone uses a password manager, the long and hashed unique passwords also don’t give the attacker anything
Phishing resistance: whatever device you’re using knows which keypair goes with which site, and you have to fool it rather than fool the human user. And a key failure mode of password managers – that you can just override the password manager’s refusal and manually copy/paste the password to log into a phish anyway – is a lot more difficult since you’d need to manually conduct the whole cryptographic dance unassisted. You can if you’re sufficiently motivated and knowledgeable, but that’s a pretty high barrier.
On the other hand, if someone gets your device in their hands, they now have immediate access to all sites. It’s also not trivial to revoke the access - you can’t just use another device to do it, because 1.) how do you know which passkeys to revoke and 2.) what stops the attacker from doing this before you do it? But if you have to follow a different process (without passkeys) to revoke passkeys then we are back to the fact that this different process can be used for phishing attacks.
Simple end-user workflow: when it integrates into the browser
And when it doesn’t it becomes much more complicated. And on the implementation side it also becomes more complicated; hashing and comparing passwords is a very simple thing to do in comparison.
All in all, there are a lot of open questions, and the real question is whether the added complexity, combined with human usage, doesn’t create more (maybe yet unknown) security issues than password managers do.
Ok, there’s a fundamental misunderstanding of the threat model here.
Passkeys protect against all the actual real world threats. They do not protect against someone with physical access to your hardware, and knowledge of the login/unlock credential of that hardware. But if that is your threat model, passwords also fail.
However, even in the case of that level of compromise, the design of passkeys was intentionally such that the actual key material could not be extracted, so even that level of compromise would not allow for secret use of the credentials because passkeys were still restricted to the host device. Obviously once you add the ability to export credentials now an attacker with this level of access can extract that key material alongside all your passwords.
To go through your points:
On the other hand, if someone gets your device in their hands, they now have immediate access to all sites.
Which they get with your passwords already
you can’t just use another device to do it, because 1.) how do you know which passkeys to revoke
All of them, just like all your passwords
2.) what stops the attacker from doing this before you do it?
Nothing, just like your passwords
But if you have to follow a different process (without passkeys) to revoke passkeys then we are back to the fact that this different process can be used for phishing attacks.
Just like passwords.
And on the implementation side it also becomes more complicated; hashing and comparing passwords is a very simple thing to do in comparison.
On the other hand, all that “complicated” stuff means:
Your users cannot be man in the middled into account compromise
Your users cannot be social engineered into account compromise
Your users get the above without having to use a separate 2FA gadget
Your site can avoid the need for obnoxious and phishable 2FA hoops in the normal sign in flow for passkey sign in
Your users (those who are technically minded) don’t have to wonder about whether your site (*literally every site, which is functionally anonymous to the user, not specifically you) is actually hashing and storing passwords properly
Again, the problem here is a failure to recognize the actual threat model. Outside of specific (expensive) cases, the real-world problem is not someone stealing your/your users’ hardware and then using the independently discovered passwords from that hardware (because stealing your hardware doesn’t magically bypass your device passwords). But if that is a threat model that matters in your use cases, then passwords are still broken there as well, and the issue is that while objectively more secure and harder to exploit than passwords, passkeys are still possibly not good enough.
The overwhelming attack vector in the real world is phishing, social engineering, malware, and similar attacks, where the goal is to get you (or your system) to - by some mechanism - provide all the required log in credentials for the target site. Something that cannot be done with pass keys, because of the complicated stuff you’re questioning the value of. There is little to no interest in physical access to your devices, and in general the relevant parties would like to not even be in the same countries as their victims, let alone the general vicinity.
The problem of “what if someone has access to my device[s] and the passwords/codes required to unlock them” is outside of the bounds of any of this, because at that point they can access any and all of the data that device ever has. Again, prior to this proposal to support exporting passkeys, passkeys were still superior to passwords because by design your attacker would not be able to copy the key material and so could not just silently reuse your passkeys at their leisure, which they can easily do with passwords, which is to my mind much worse.
They do not protect against someone with physical access to your hardware, and knowledge of the login/unlock credential of that hardware. But if that is your threat model, passwords also fail.
Not if the passwords are in my head only. Which is true for the really important ones.
so even that level of compromise would not allow for secret use of the credentials because passkeys were still restricted to the host device
Yeah, but if I have access to all the passkeys and create new ones, why would I need the “credentials” (i.e. the private key)?
On the other hand, all that “complicated” stuff means:
Your users cannot be man in the middled into account compromise
Yes they can because of the “different process” for recovery. You might claim that this is harder to abuse, but I’m not so sure. It might turn out it’s not.
Your users cannot be social engineered into account compromise
See above
Your users get the above without having to use a separate 2FA gadget
And the attacker only needs one device instead of two now. Just like with a password manager.
Your site can avoid the need for obnoxious and phishable 2FA hoops in the normal sign in flow for passkey sign in
Your users (those who are technically minded) don’t have to wonder about whether your site (*literally every site, which is functionally anonymous to the user, not specifically you) is actually hashing and storing passwords properly
That last part is really the only thing that is really positive - it’s guaranteed that an attacker can’t access the service even if they are able to get a copy of the database of the service. With passwords, the service needs to have the password hashed and not in plaintext.
Not if the passwords are in my head only. Which is true for the really important ones.
Then, even if they are good, they’re just as susceptible to phishing as passwords in a password manager, only a password manager is not as easily fooled by online phishing scams as people are. Neither really has a meaningful defense against multi-step social engineering.
Yeah, but if I have access to all the passkeys and create new ones, why would I need the “credentials” (i.e. the private key)?
Because passkeys are not passwords; they’re keys used for a cryptographic operation. Signing in is not a matter of vending a secret to the host. The host service transmits a nonce; the browser takes that nonce and looks up the passkey for the service it has connected to over TLS, if it has one; then you approve the operation via some local authentication mechanism (i.e. malware compromising the device cannot arbitrarily perform these operations; that’s part of the reason for the HSM involvement); then the passkey implementation’s HSM signs a response that includes the server-provided nonce, and the browser sends that back to the server, which validates your current session.
So having access to your device, and the ability to unlock your device, an attacker can sign into a service that uses a passkey - just like they could with a password manager. But now, this person - who remember, we’ve already declared has your device passwords - can copy all of your passwords off your device to use at their leisure. To abuse their current compromise of your device they would have to add an additional passkey (something you can and should be notified about, and you can see the list of valid keys).
And the attacker only needs one device instead of two now. Just like with a password manager.
The attacker doesn’t need any of your devices, that’s literally why phishing works. 2FA increases the cost of compromise, but in your threat model your accounts are valuable enough for them to be working towards and successfully getting:
Physical access to your device[s]
Your device passwords/passcodes
So the cost of intercepting the standard 2FA schemes is not a major issue. Moreover, they can extract the state from any of your 2FA apps: given that they have access to your 2FA apps and device passwords, they can use those to add their own devices to your 2FA mechanisms.
That last part is really the only thing that is really positive - it’s guaranteed that an attacker can’t access the service even if they are able to get a copy of the database of the service. With passwords, the service needs to have the password hashed and not in plaintext.
You seem to not be understanding the attack vectors here.
The reason for password hashing is not to protect your own service. At the point your servers are compromised to the point that an attacker has access to your authentication database, your service, and all of the accounts it hosts are completely compromised. The reason for password hashing is so that an attacker cannot take the passwords from your server, and then try them on other services. Run by other people.
Password hashing is not a method to protect your service, or to protect your users on your service. It is solely to protect your users against those secondary attacks, which do not otherwise involve you at all.
Passkeys render that problem moot, not because there’s no password to leak, but because the client will never use a passkey from Service A to authenticate against anything that cannot prove that it is itself Service A. So authentication is not relayable (and so phishing and spoofing are not possible), and any given authentication response is not replayable, so a moment-in-time compromise does not provide data that allows the attacker to authenticate again in the future.
The purpose of passkeys is to secure users against actual real world attack vectors, rather than hypothetical attacks, especially ones as extreme as what you are suggesting. They are more secure than passwords in every practical sense, they are more secure than any multi factor scheme.
They are not robust against your proposal of a person who has gained physical access to your hardware, and who has the ability to authenticate themselves as you on that hardware. But that vector is even more catastrophic for passwords, including those that are in your head, because at that point your attack vectors include malware, key loggers, malicious extensions, etc which mean your passwords can be intercepted, and then used at will by the attacker.
Not sure if that’s worth all the disadvantages…
That’s your call, but I’m concerned that you are making that call without an accurate threat model (your go-to model is a local hardware attack by someone who has your device passwords), nor an accurate model of the complexity. I suspect, for example, that you have no problem with your site using TLS, despite TLS being vastly more complex.
Again: you either lose full access once you lose your passkey(s) - or you have a recovery flow and now a phisher does not need access to the device but rather to the information in the recovery flow.
You seem to not be understanding the attack vectors here. The reason for password hashing is not to protect your own service.
Honestly, instead of making assumptions, maybe try to understand what I’m saying.
And yes, my threat model might be different than yours. I see anything between me and a service as a potential threat and that includes my hardware, my OS and so on.
And yes, I do have my doubts about TLS, and let’s not pretend there haven’t been major security issues with those certificates before.
Not if the passwords are in my head only. Which is true for the really important ones.
You’re doing something 99% of users never will.
And the attacker can’t open my device, and thus can’t access my passkeys, because they would need to know my pin code (that’s in my head), or use physical violence to get me to put my thumb on the fingerprint scanner.
If everyone uses a password manager, the long and hashed unique passwords also don’t give the attacker anything
If you can get 100% of people to follow 100% perfect security hygiene 100% of the time, sure, technical improvements in the nature of the credentials don’t matter. I wish you luck in your attempt. One thing passkeys and other non-password-based systems try to address is the fact that we’ve never succeeded at this and so maybe “fix the people” isn’t going to be the solution.
On the other hand, if someone gets your device in their hands, they now have immediate access to all sites.
The mobile-device implementations I’ve seen rely on the mobile device’s biometrics to gate access to passkeys. Same for desktop implementations – lots of computers have a fingerprint reader, for example. And lots of people already set up that stuff for convenience, which means you don’t have to change user behavior to get them to do it.
Besides, someone getting their hands on your device is an immediate breach of your password manager, too, and if you want to argue the password manager should be locked by a password or biometrics, well, passkeys should be too, so password managers offer no improvement.
And when it doesn’t it becomes much more complicated. And on the implementation side it also becomes more complicated; hashing and comparing passwords is a very simple thing to do in comparison.
“Simple” hashing and comparing of passwords is anything but.
At any rate, it seems pretty clear that you are not really open to having your mind changed and that no amount of argument or evidence would change it, so I’m going to drop out of this conversation.
If you can get 100% of people to follow 100% perfect security hygiene 100% of the time, sure, technical improvements in the nature of the credentials don’t matter.
The same is true for passkeys. If no one uses them, they don’t help. Or do you mean forcing everyone to use passkeys? Yeah, well, great. And sure, you can’t force people to use password managers.
The mobile-device implementations I’ve seen rely on the mobile device’s biometrics to gate access to passkeys
And is that guaranteed to be secure? Can a scammer not fake your fingerprint with all those devices? As far as I know, the devices can usually be tricked pretty easily.
Besides, someone getting their hands on your device is an immediate breach of your password manager, too
That’s actually true, and it is the reason why I’d like to see a password manager that limits the number of accesses over time. Of course, for that to work it can’t run locally.
“Simple” hashing and comparing of passwords is anything but.
But the shared secret (password) is still symmetric cryptography.
This matters because with passkeys, a bad actor who controls the website for a few seconds can extract your token and use it for a while. But with a password, they can extract the password and then use it to generate new tokens.
(there are ways around this even with symmetric cryptography such as hash-chains, but it’s not really used in the wild)
That really only makes sense as a rebuttal if there are significant advantages. The password manager workflow is that I create a secret key which I store in a secure key store, I give that key to the website, and then the website gives my client a temporary access token when I provide my key. That temporary access token is then used for day-to-day authentication, requiring renewal using the secret key every now and then or when authenticating a new device.
This, honestly, is a pretty great system. What’s more, it’s universally supported everywhere already. What benefit does passkeys actually provide compared to this?
They are not vulnerable to interception. If your browser is compromised or some part of the connection is insecure, for example, and you use a password then that password is visible as plain text. An attacker can copy it and then later, from another machine, log into the service. In contrast, with a pass key the attacker sees only the challenge-response for the current session. They cannot replay that later.
They are not vulnerable to leakage if the server does not store them securely. A lot of systems over the past decades have been compromised and had password databases leak. Some of these were unhashed and so were complete compromises. If you reused a password, other systems may be compromised with it. Even with hashing, older hash schemes are vulnerable to brute force attacks and so attackers can find passwords. With a pass key, the server holds only the public key. Exfiltrating this does not help the attacker (unless they have a quantum computer, but it’s easy to move passkeys to PQC before this becomes a realistic threat).
They cannot be reused. You may have a single secret, but this is used with the remote host name in a key derivation function to provide a unique private key for each service. This prevents impersonation attacks because your key for service X and your key for service Y are guaranteed to be different. If you go to a phishing site and it forwards a challenge from another site, you will sign it with the wrong key and the login will not work.
If Apple’s sync service is compromised, passkeys can leak, but this would have to happen at the point when you set up a new device. The secure elements on the synced devices perform a key exchange protocol via iCloud, with iCloud providing some attestation as to device identity when you enable syncing. The exchanged key is never visible to another device (iCloud can’t intercept it, but it could make your source device do the key exchange with the wrong target; this is similar to a TLS handshake, where iCloud plays the role of the certificate authority). After this point, iCloud only ever sees ciphertext for wrapped keys and so can’t decrypt them even if it is completely compromised (though it could trick users into initiating the sync flow to an untrusted device). I believe the secure elements also have a certificate that’s baked in at device provisioning time, so you’d need to also compromise that if you wanted to sync keys to a device that let you extract them, rather than to another Apple device.
What benefit does passkeys actually provide compared to this?
For one, this system does not allow a central entity to monitor and disable someone’s access to all their services at once. (you didn’t specify who the benefit should apply to…)
I relate a lot to the state synchronization issue. I migrated a project from Go+Templ+HTMX to Go+OpenAPI+React (no SSR) because of this.
HATEOAS is a nice idea, but when your application has a lot of client-side interactivity and local state, it becomes difficult (at least for me) to keep a clean codebase that does not turn into a spaghetti monstrosity.
I mean, “state synchronization” is (IMO) the whole problem React is intended to solve. So when folks balk at the way that React works and advocate for a stripped-down lib, my question is always, “okay, so how are you planning on solving it?”
React solves state synchronization on the client side, but not between the server and the client. Actually, this second part often becomes more difficult as one adds more interactivity client-side. That’s what’s leading some teams (for example linear.app) to treat the sync problem at the database level, and replicate the database between the server and the client. Then the React client becomes a “simple” view on top of the database replica.
This is really neat - I hadn’t seen it before, it looks like the page was created 14th of September.
The protocol is for the replica to send a cryptographic hash of each of its pages over to the origin side, then the origin sends back the complete content of any page for which the hash does not match.
Something I’ve worried about in the past is that if I want to make a point-in-time backup of a SQLite database I need enough disk space to entirely duplicate the current database first (using the backup mechanism). This tool fixes that - I don’t need any extra disk space at all, since the pages that have been updated will be transmitted directly over the wire in 4096 byte chunks.
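To picture the exchange, here is a rough origin-side sketch of my own (using `std::hash` only as a stand-in for the real cryptographic page hash):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string_view>
#include <vector>

// Illustration only: the real tool uses a cryptographic hash; std::hash is just a
// dependency-free stand-in so the shape of the exchange is visible.
using Page = std::array<std::uint8_t, 4096>;

std::size_t page_digest(const Page &p) {
    return std::hash<std::string_view>{}(
        std::string_view(reinterpret_cast<const char *>(p.data()), p.size()));
}

// Origin side: given the digests the replica sent for its pages, decide which
// origin pages have to be shipped over in full.
std::vector<std::size_t> pages_to_send(const std::vector<Page> &origin_pages,
                                       const std::vector<std::size_t> &replica_digests) {
    std::vector<std::size_t> changed;
    for (std::size_t i = 0; i < origin_pages.size(); ++i) {
        bool replica_has_page = i < replica_digests.size();
        if (!replica_has_page || replica_digests[i] != page_digest(origin_pages[i]))
            changed.push_back(i);   // page is new or differs: send its 4096 bytes
    }
    return changed;
}
```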
Hm, that seems inefficient. Assuming it uses 256-bit hashes and 4KB pages, that’s roughly 1% of the total database size as overhead, even before it starts sending any changed pages.
I would have done this using something like a Prolly tree, or Aljoscha Meyer’s range-based set reconciliation, which only have to send O(log n) probes to find the differences. They can also detect a no-op (no differences) by sending just one hash.
Even in the years before that, it was Rails convention to pass Hashes and pretend to have keyword arguments that way. (By Ruby 2.x, keyword arguments were supported with native syntax.)
You’re correct, but as I mentioned in the post, Python has no special syntax for specifying a keyword argument. You can define keyword-only arguments, but every positional argument can also be specified as a keyword argument when you call it.
The difference is in defaults and flexibility: obviously in Ruby you can define every method you personally write to use keyword arguments, and I’m sure that there’s a cop you can enable to enforce that, but Python does the right thing by default. If the codebase at work were written in Ruby instead of Python, I don’t think I could rely on past colleagues to have defined methods that take 7 arguments using keywords, but since it’s Python I don’t have to worry about it.
every positional argument can also be specified as a keyword argument when you call it
Yes, Python is super flexible in that regard, but I often find that the interaction of positional arguments, keyword arguments, default values, and varargs leads to some overly complex APIs. I’ve been guilty of abusing it myself. I can see why some languages like Zig avoid this.
I think it’s valid to conclude that the keyword arguments experiences of Python and Ruby are still very different and that you might long for the Python one (as someone who trod a very similar path to the article author, using Ruby intensely for over a decade before having more to do with Python again). You can’t throw a kwargs-style hash at a Ruby method if it’s expecting positional arguments, even if those positional arguments do have names. But you can of course use the x(y=z) syntax; there’s just no way to specify y programmatically. The confusion of both is unfortunate, and let’s not forget that period in-between where there was all that munging of kwargs.
I find over and over again that any system that models the world as intercommunicating tasks and has a designer who really works out all the details for robustness… tends to look a lot like Erlang.
Is it possible to self-host Deno Deploy, or are you locked into their platform?
As far as I can remember, anything you can do on Deno Deploy also works with the Deno binary. The KV database is backed by SQLite when running that way, but you can also self-host a FoundationDB-based version (not sure if this is the same one that powers Deploy).
You’re essentially locked in. You could build all the infra yourself and run your Deno code in the open-source runtime, but that’s lock-in to me.
The nice things about Deno IMO aren’t secret, patent-encumbered enterprise features; they’re just existing ideas brought together and executed well, something I believe the open source world is fully capable of doing :)
I think that’s a fair take. A few comments:
Rust has const. While not as powerful as comptime, it is getting there with features such as references to globals (const_refs_static).
I like Rust and not Zig, but is it really? AFAIK Rust has no intention of adding a similar ability to generate types at compile-time.
I actually love the name. Celery with type safety is broccoli, because it’s better and safer for you!
I see you plan to support Kafka and Rabbit. Both very sensible choices given you already support Redis. If you’re looking to get adoption from the bottom-up instead of top-down, you might consider choosing postgres queue technology for all of us average-scalers since Redis already covers high-scalers.
I agree!
PG as a broker is amazing from my limited experience (with python/procrastinate): with 10/20 million jobs a day at work it does not break a sweat and it makes observability, debuging and testing so much more easier than with celery/rabbitmq. I will not use anything else unless forced to :D
Same here. I don’t understand why people insist on using Redis or Rabbit as a store for job queues when most projects already use PostgreSQL or MySQL. Also, I can rely on transactional semantics when queuing jobs if they are stored in the same SQL database, which is a significant simplification, if I want to ensure that I’m never losing jobs.
I’ve had a lot of luck, on many projects, using a database (Usually PostgreSQL, MySQL in the past) as a mid-high latency job dispatch mechanism. It’s not a replacement for a message queue but it’s rare for people to actually desire/need the trade-offs of a message queue. Usually it’s just a job queue.
Back in ~2011/2012 I was juggling over a billion jobs (each an individual row) in a single jobs table on a MySQL RDS instance. It works fine if you know what you’re doing and if you don’t, well, MQs aren’t any more merciful than databases in liminal circumstances.
Hi adriano,
I love the name too :). I think that PG queues are very interesting and promising however I believe that interacting with PG would require a different approach than how I originally built the lib. However, I am open to contributions and if you want to try to post a draft PR for adding this in.
Hey, I’m just here to create more work, not do the work :)
I actually am not a rust user. I just like to spread the postgres-as-a-queue gospel around the internet when the opportunity arises. Single-datastore applications are a joy to work with.
Rust moving up so much and so quickly is definitely quite frustrating.
min-ver’s biggest advantage is that it puts the brakes on by default, and I think this is a good thing. It is very deliberate in saying that you move up intentionally. I think it’s the better default for future ecosystems.
Would it be possible for Rust/Cargo to switch from max-ver to min-ver?
Since the resolver can be changed between editions it could be possible but the interactions between old and new crates would be very confusing. Generally though I think that the rust community does not like min-ver.
It could be an option for the package/workspace, right next to `resolver =`.
So many people talking about how bad Kubernetes is without any experience with it. I’m pretty sure many people also tried to host it themselves and got bitten so badly that they now pile on top of it.
I’ve used managed Kubernetes for many projects and they just run fine. It’s actually a relief to write down what you need for your app to run and it gets done.
I mean, you have a learning curve, but many things you want are just present by default without you having to care about them. It’s a platform, you learn it once and you just have to use it then (I’m not talking about cluster operators etc. here).
Some people in the comments mention monitoring being an issue. I find it’s quite the opposite. You just have everything managed by default: you deploy a DS or a CRD with your monitoring stack and you can be sure that everything will be monitored. You won’t get the usual reply from the monitoring team like “oh sorry, the log file permission was wrong when we updated the box, sorry you weren’t paged for that incident”.
In my experience, k8s exemplifies the Turing Tarpit: everything is possible, nothing of interest is easy.
I’ve run into pretty major footguns for each of those examples, and I have plenty more war stories where that came from.
Oops, you wanted more than one process to have write access? Tough luck, you’ll have to use some kind of NFS mount instead
We had a couple teams inadvertently open up unauthenticated access to their apps due to a single-character typo in their YAML. whoops
Oh wait, you want that integrated with your platforms Secrets Management? You’ll need to install a third-party operator for that
Pretty sure that’s as easy as setting a `ReadWriteMany` access mode?
How does this happen (i.e., what was the typo)? Kubernetes ingress doesn’t even do auth out of the box…
That seems… fine? Why would you expect a platform to ship with support for a third-party secrets management platform? It’s not like you get automatic secrets management of any kind if you’re managing your own VMs. You still have to install and configure software either way, but without Kubernetes you have to do it on each individual box, and in the best case you’re still having to deal with Ansible or similar versus Kubernetes manifests.
Not supported by GKE!
Missing `-`, which caused the AuthorizationPolicy to interpret what should have been one thing as multiple.
I’m responding to the parent’s contention:
This is not an accurate description of what it’s like to deploy to a managed k8s system. It’s extremely difficult, time-consuming, and full of footguns. I dread having to make changes to my k8s deployments, because it always surprises me how difficult and tedious it is.
I’m 99% sure we were doing this several years ago with a Filestore storage backend. I set it up–I don’t remember it being particularly troublesome.
Was the `-` a YAML list-item syntax designator? I’m not familiar with AuthorizationPolicies; I don’t think that’s part of Kubernetes or even GKE. Maybe some Kubernetes custom resource definition for some other Google cloud offering, but it feels odd to fault Kubernetes or GKE for something that isn’t part of either.
I think no matter what you do it’s difficult and tedious. If you’re managing VMs or hosts, you have pretty similar problems, but you end up building much of what Kubernetes is in your own bespoke way and your organization won’t be able to benefit from all of the documentation, training materials, and hiring pool that you get from using a relatively standard thing like Kubernetes. 🤷♂️
Filestore is NFS! And expensive as hell.
Multiple processes writing to the same storage is a thing computers have been doing longer than I’ve been alive!
I’m not saying you can’t work around this, I’m saying Kubernetes makes this harder.
I’m responding to:
It’s not that simple! You have to bring your own authn/authz layer, which is significantly more complicated than a reverse proxy.
Kubernetes doesn’t solve this problem! There are so many, many ways to do everything, your k8s cluster will be a special snowflake. You’ll constantly dig up documentation that conflicts with your situation because you didn’t do it exactly that way (cf. my AuthorizationPolicy bug—you haven’t run into this because we’re doing authn differently)
Yes, but you don’t really need to know or care that it’s NFS. And yes, it is expensive.
You can do that on any Kubernetes distro without special integration, but you’ll be constrained in exactly the same way as the computers of olde. If you want networked storage, you’ll need to use a networked container storage interface driver.
I’m not sure this is true. If all you want is multiple processes writing to the same storage, just fork two processes in your container and have them write to the same storage. I suspect you mean multiple processes on different machines writing to the same storage, in which case you’ll need to configure a networked container storage interface just like you would have to set up a networked file system in the sans-Kubernetes case.
This confuses me–you need an authn/authz layer regardless of whether you’re using Kubernetes, right?
Kubernetes is extensible, and to the extent that you extend it, it will be a special snowflake, but the default stuff is bog standard. For example, replicasets, deployments, jobs, cronjobs, services, secrets, configmaps, etc are all standard with kubernetes, and they’re all things you would have to build yourself for most real world applications. Your cluster won’t be exactly like another’s, but it will be a lot more similar than any two VM-based systems.
I agree. The API and tools for k8s users are quite nice and comprehensive. I think that what people are criticizing is that k8s is quite difficult to deploy and maintain, and it doesn’t scale down well to single-host systems, which can be useful for local dev, learning, and small deployments.
Maybe I’m just doing it wrong, but it often feels like managing native hosts is a lot more painful, at least if you want to do it “right” rather than just SSH-ing on and making manual changes. For my homelab Raspberry Pi cluster, I have to do some minimal host stuff to get my nodes to run Kubernetes, and from there I just set a few secrets and apply all of the manifests and things kind of just work. Recently I rebuilt my cluster from scratch and it was pretty easy.
Can you elaborate on this? I do most of my local dev on Docker’s Kubernetes integration. I haven’t noticed any problems.
This is a falsehood some people in the intersection of the data oriented design and C crowd love to sell. RAII works fine with batches, it’s just the RAII object is the batch instead of the elements inside. Even if the individual elements have destructors, if you have an alternative implementation for batches C++ has all the tools to avoid the automatic destructor calls, just placement new into a char buffer and then you can run whatever batch logic instead. Don’t try bringing this up with any of them though or you’ll get an instant block.
I do highly performance sensitive latency work with tighter deadlines than gamedev and still use destructors for all the long lived object cleanup. For objects that are churned aggressively avoiding destructor calls isn’t any different than avoiding any other method call.
Agreed, this post is making some wild claims that don’t hold up in my experience. I’m writing a high-performance compiler in Rust, and most state exists as plain-old datatypes in re-usable memory arenas that are freed at the end of execution. RAII is not involved in the hot phase of the compiler. Neither are any smart pointers or linked lists.
I simply find the argument unconvincing. Visual Studio has performance problems related to destructors => RAII causes slow software?
Agreed. I like Zig and appreciate Loris’ work, but I don’t understand this argument as well.
“Exists in tension” seems accurate to me. Yes, you can do batches with RAII, but in practice RAII languages lead to ecosystems and conventions that make it difficult. The majority of Rust crates use standard library containers and provide no fine grained control over their allocation. You could imagine a Rust where allocators were always passed around, but RAII would still constrain things because batching to change deallocation patterns would require changing types. I think the flexibility (and pitfalls) of Zig’s comptime duck typing vs. Rust traits is sort of analogous to the situation with no RAII vs. RAII.
I think it’s the case that library interfaces tend not to hand control of allocations to the caller but I think that’s because there’s almost never pressure to do so. When I’ve wanted this I’ve just forked or submitted patches to allow me to do so and it’s been pretty trivial.
Similarly, most libraries that use a HashMap do not expose a way to pick the hash algorithm. This is a bummer because I expect the use of siphash to cause way more performance problems than deallocations. And so I just submit PRs.
Yes. I write Zig every day, and yet it feels like a big miss, and, idk, populist? “But don’t just take my word for it.” Feels like too much trying to do ‘convincing’ as opposed to elucidating something neat. (But I guess this is kind of the entire sphere it’s written in; what does the “Rust/Linux Drama” need? Clearly, another contender!)
To be fair, invalidating this specific argument against RAII does not invalidate the entire post.
You write Zig every day? What kind of program are you working on?
It doesn’t, but without it I don’t really see the post offering anything other than contention for the sake of marketing.
I spend somewhere between 2 to 8 hours a day working on my own projects. (“2” on days I also do paid work, but that’s only two days a week.) Zig has been my language of choice for four or five years now; you can see a list on my GitHub profile. A lot of my recent work with it is private.
Impressive commitment to Zig! Thanks for sharing.
Thank you! I really like it, and I’m a little sad that Rust — which I still use often, maintain FOSS software in, and advocate for happily! — has narrowed the conversation around lower-level general-purpose programming languages in a direction where many now reject out of hand anything without language-enforced memory safety. It’s a really nice thing to have, and Rust is often a great choice, but I don’t love how dogmatic the discourse can be at the expense of new ideas and ways of thinking.
I very much agree. A Zig program written in a data-oriented programming style, where most objects are referenced using indices into large arrays (potentially associated to a generation number) should be mostly memory safe. But I haven’t written enough Zig to confirm this intuition.
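To illustrate the pattern (a sketch of my own, in C++ rather than Zig): handles carry an index plus a generation, so a stale handle is detected instead of dereferencing freed memory.

```cpp
#include <cstdint>
#include <vector>

struct Handle {
    std::uint32_t index;
    std::uint32_t generation;
};

template <typename T>
class Pool {
    struct Slot {
        T value{};
        std::uint32_t generation = 1;
        bool alive = false;
    };
    std::vector<Slot> slots_;

public:
    Handle insert(T value) {
        // A real pool would reuse freed slots; always appending keeps the sketch short.
        slots_.push_back(Slot{std::move(value), 1, true});
        return Handle{static_cast<std::uint32_t>(slots_.size() - 1), 1};
    }

    // Returns nullptr for stale or out-of-range handles instead of a dangling pointer.
    T *get(Handle h) {
        if (h.index >= slots_.size()) return nullptr;
        Slot &s = slots_[h.index];
        if (!s.alive || s.generation != h.generation) return nullptr;
        return &s.value;
    }

    void remove(Handle h) {
        if (get(h) != nullptr) {
            slots_[h.index].alive = false;
            ++slots_[h.index].generation;   // every outstanding handle to this slot goes stale
        }
    }
};
```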
I don’t remember the arguments against RAII much (has been a few years since) but that Zig doesn’t have RAII feels like an odd omission given the rest of the language design. It’s somewhat puzzling to me.
Hm, it’s pretty clear RAII goes against the design of Zig. It could be argued that it’d be a good tradeoff still, but it definitely goes against the grain.
An implicit call to a “`__deinit__` function if available” would be the sole place where that sort of thing would be happening. `defer` fits Zig very well, RAII not at all.
I was unaware that Zig discourages holding on to the allocator. I did not spend enough time with Zig, but for instance if you have an `ArrayList` you can `defer .deinit()` and it will work just fine. So I was assuming that this explicit pattern could be turned into something more implicit.
I understand that “hidden control flow” is something that Zig advertises itself against, but at the end of the day `defer` is already something that makes this slightly harder to understand. I do understand that this is something the language opted against, but it still feels odd to me that no real attempt was made (seemingly?) to avoid `defer`.
But it very much sounds like this pattern is on the way out anyways.
Zig’s `std.HashMap` family stores a per-collection allocator inside the struct that is passed in exactly once through the `init` method. Idk how that can be considered non-idiomatic if it’s part of the standard library.
It is getting removed! https://github.com/ziglang/zig/pull/22087
Zig is a pre-1.0 language. Code in the stdlib is not necessarily idiomatic, both because there’s still idiom churn and because it was not uniformly audited for code quality.
As someone who doesn’t use Zig or follow it closely, both the fact that that change is being made and the reason behind it are really interesting. Thanks for sharing it here
You might also like https://matklad.github.io/2020/12/28/csdi.html then, as a generalization of what’s happening with Zig collections.
That’s an interesting development. Thanks for informing me!
It would completely change the design of the language and its approach to memory and resource management.
I’ve never used placement `new`, so I don’t know about that, so my question is, how do you do that? Take for instance a simple case where I need a destructor:
If I have a bunch of elements that are both constructed at the same time, then later destroyed at the same time, I can imagine having a dedicated `Element_list` class for this, but never having used placement `new`, I don’t know right now how I would batch the allocations and deallocations.
And what if my elements are constructed at different times, but then later destroyed at the same time? How could we make that work?
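For concreteness, a minimal sketch of the kind of element I mean (the `Foo`/`Bar`/`Baz` member types are just placeholders):

```cpp
#include <memory>

struct Foo {};
struct Bar {};
struct Baz {};

// Each Element owns three separately heap-allocated members, so constructing or
// destroying N elements means 3*N allocations/deallocations with plain RAII.
struct Element {
    std::unique_ptr<Foo> foo = std::make_unique<Foo>();
    std::unique_ptr<Bar> bar = std::make_unique<Bar>();
    std::unique_ptr<Baz> baz = std::make_unique<Baz>();
};
```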
I think I have an idea about their perspective. I’ve never done Rust, but I do have about 15 years of C++ experience. Not once in my career have I seen a placement
new. Not in my own code, not in my colleagues’ code, not in any code I have ever looked at. I know it’s a thing when someone mentions it, but that’s about it. As far as I am concerned it’s just one of the many obscure corners of C++. Now imagine you go to someone like me, and tell them to “just placement new” like it’s a beginner technique everyone ought to have learned in their first year of C++.I don’t expect this to go down very well, especially if you start calling out skill issues explicitly.
I’m a little bit surprised, because I’ve had the opposite experience. Systems programming in C++ uses placement new all of the time, because it’s the way that you integrate with custom allocators.
In C++, there are four steps to creating and destroying an object:
When you use the default
newordeleteoperators, you’re doing two of these: first calling the globalnew, which returns a pointer to some memory (or throws an exception if allocation fails) and then calling the constructor, then calling the destructor. Bothnewanddeleteare simply operators that can be overloaded, so you can provide your own, either globally, globally for some overload, or per class.Placement new has weird syntax, but is conceptually simple. When you do
new SomeClass(...), you’re actually writingnew ({arguments to new}) SomeClass({arguments to SomeClass's constructor}). You can overloadnewbased on the types of the arguments passed to it. Placement new is a special variant that takes avoid*and doesn’t do anything (it’s the identity function). When you donew (somePointer) SomeClass(Args...), wheresomePointeris an existing allocation, the placement new simply returnssomePointer. It’s up to you to ensure that you have space here.If you want to allocate memory with
mallocin C++ and construct an object in it, you’d write something like this (not exactly like this, because this will leak memory if the constructor throws):This separates the allocation and construction: you’re calling
mallocto allocate the object and then calling placement new to call the constructor and change the type of the underlying memory toT.Similarly, you can separate the destruction and deallocation like this (same exception-safety warning applies):
In your example,
std::unique_ptrhas a destructor that callsdelete. This may be the global delete, or it may be somedeleteprovided byFoo,Bar, orBaz.If you’re doing placement new, you can still use
std::unique_ptr, but you must pass a custom deleter. This can call the destructor but not reclaim the memory. For example, you could allocate space for all three of the objects in your ‘object’ with a single allocation and use a custom deleter that didn’t free the memory instd::unique_ptr.Most of the standard collection types take an allocator as a template argument, which makes it possible to abstract over these things, in theory (in practice, the allocator APIs are not well designed).
LLVM does arena allocation by providing making some classes constructors private and exposing them with factory methods on the object that owns the memory. This does bump allocation and then does placement new. You just ‘leak’ the objects created this way, they’re collected when the parent object is destroyed.
Thanks for the explanation, that helps a ton.
I’ve done very little systems programming in C++. Almost all the C++ code I have worked with was application code, and even the “system” portion hardly did any system call. Also, most C++ programs I’ve worked with would have been better of using a garbage collected language, but that wasn’t my choice.
This may explain the differences in our experiences.
Yup, that’s a very different experience. Most C++ application code I’ve seen would be better in Java, Objective-C, C#, or one of a dozen other languages. It’s a good systems language, it’s a mediocre application language.
For use in a kernel, or writing a memory allocator, GC, or language runtime, C++ is pretty nice. It’s far better than C and I think the tradeoff relative to Rust is complicated. For writing applications, it’s just about usable but very rarely the best choice. Most of the time I use C++ in userspace, I use it because Sol3 lets me easily expose things to Lua.
I think it very much also depends on the subset of C++ you’re working with, at a former job I worked on a server application that might have worked in Java with some pains (interfacing with C libs quite a bit), and in (2020?) or later it should have probably be done in Rust but it was just slightly older that Rust had gained… traction or 1.0 release. It was (or still is, probably) written in the most high-level Java-like C++ I’ve ever seen due to extensive use of Qt and smart pointers. I’m not saying we never had segfaults or memory problems but not nearly as many as I would have expected. But yeah, I think I’ve never even heard about this placement new thing (reading up now), but I’m also not calling myself a C++ programmer.
Placement new is half the story, you also need to be aware that you can invoke destructors explicitly.
A trivial example looks like
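Something along these lines (my sketch of the idea):

```cpp
#include <new>

struct foo {
    int x;
    explicit foo(int x) : x(x) {}
    ~foo() { /* release whatever foo owns */ }
};

int main() {
    alignas(foo) char storage[sizeof(foo)];  // raw storage, nothing constructed yet
    foo *f = new (storage) foo(42);          // placement new: construct in place
    f->~foo();                               // explicit destructor call, no deallocation
}
```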
If you want to defer the construction of multiple foos but have a single allocation you can imagine
char foos_storage[sizeof(foo)*10]and looping to call the destructors. Of course you can heap allocate the storage too.However, you mostly don’t do this because if you looking for something that keeps a list of elements and uses placement new to batch allocation/deallocation that’s just
std::vector<element>.Likewise if I wanted to batch the allocation of Foo Bar and Baz in Element I probably would just make them normal members.
Each element and its members is now a single allocation and you can stick a bunch of them in a vector for more batching.
If you want to defer the initialization of the members but not the allocation you can use
std::optionalto not need to deal with the nitty gritty of placement new and explicitly calling the destructor.IME placement new comes up implementing containers and basically not much otherwise.
Note that since C++20+ you should rather use std::construct_at and std::destroy_at since these don’t require spelling the type and can be used inside constexpr contexts.
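For example (a small sketch, assuming a throwaway `foo` type):

```cpp
#include <memory>   // std::construct_at, std::destroy_at (C++20)

struct foo {
    int x;
    explicit foo(int x) : x(x) {}
};

int main() {
    alignas(foo) char storage[sizeof(foo)];
    foo *p = std::construct_at(reinterpret_cast<foo *>(storage), 7);  // instead of placement new
    std::destroy_at(p);                                               // instead of p->~foo()
}
```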
You likely use placement new every day indirectly without realizing it, it’s used by std::vector and other container implementations.
When you write
new T(arg)two things happen, the memory is allocated and the constructor runs. All placement new does is let you skip the memory allocation and instead run the constructor on memory you provide. The syntax is a little weirdnew(pointer) T(arg). But that’s it! That will create aTat the address stored inpointer, and it will return aT*pointing to the same address (but it will be aT*whereaspointerwas probablyvoid*orchar*). Without this technique, you can’t implement std::vector, because you need to be able to allocate room for an array of T without constructing the T right away since there’s a difference between size and capacity. Later to destroy the item you do the reverse, you call the destructor manuallyfoo->~T(), then deallocate the memory. When you clear a vector it runs the destructors one by one but then gives the memory back all at once with a single free/delete. If you had a type that you wanted to be able to do a sort of batch destruction on (maybe the destructor does some work that you can SIMD’ify), you’d need to make your own function and call it with the array instead of the individual destructors, then free the memory as normal.I’m not trying to call anybody out for having a skill issue, but I am calling out people who are saying it’s necessary to abandon the language to deal with one pattern without actually knowing what facilities the language provides.
What would this look like in practice? How do you avoid shooting yourself in the foot due to a custom destructor? Is there a known pattern here?
There are different ways you could do it but one way would be to have a template that you specialize for arrays of T, where the default implementation does one by one destruction and the specialization does the batch version. You could also override regular operator delete to not have an implementation to force people to remember to use a special function.
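A sketch of the shape that could take (purely illustrative; `Widget` is a made-up type):

```cpp
#include <cstddef>

// Default: destroy an array of T element by element.
template <typename T>
struct BatchDestroy {
    static void run(T *data, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            data[i].~T();
    }
};

struct Widget {
    ~Widget() { /* per-object cleanup */ }
};

// A type that knows how to tear down a whole array at once (say, SIMD-friendly
// bookkeeping) provides a specialization that skips the element-wise destructor calls.
template <>
struct BatchDestroy<Widget> {
    static void run(Widget *, std::size_t) {
        // ... batch cleanup over the whole array ...
    }
};
```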
Ambitious project. Great post. Why Zig for this project?
It wasn’t just me who did a double take at the title and didn’t see an actual answer? Comptime was mentioned… but that’s all I saw.
I’ve responded to the parent comment, but you didn’t miss much.
`comptime`, and the ability to easily try out different allocation strategies, was all.
I just needed a way to have arbitrary AST query syntax with (near) zero runtime overhead. And I’m more familiar with Zig (or C++) than Rust.
Thank you.
To be honest, any language with manual memory management would’ve worked out just fine. Between C++, Rust, and Zig, I find that Zig has the most ergonomic way of swapping allocation schemes at will.
That, and the comptime bit I mentioned in the post lets me easily write AST queries (which is something I expect plugin authors to be doing a lot of).
I agree that Zig is really great at giving explicit control over memory allocation.
Could you point out an example in the source code of how you use comptime to simplify writing AST queries? I had a quick look but missed that.
I’m a bit surprised the article doesn’t mention the main issue with using threads for a high number of concurrent tasks: the memory used by each thread for its call stack.
Memory usage is not the main issue with threads. The memory usage of Go’s goroutines and that of threads is not that different. I think it’s like a 4x-8x difference? Which is not small, of course, but given that memory for threads is only a fraction of the memory the app uses, it’s actually not that large in absolute terms. You need a comparable amount of memory for buffers for TCP sockets and such.
As far as I can tell, the actual practical limiting factor for threads is that modern OSes, in default configuration, just don’t allow you to have many threads. I can get to a million threads on my Linux box if I sudo tweak it (I think? Don’t remember if I got to a million actually).
But if I don’t do OS-level tweaking, I’ll get errors around 10k threads.
Not touching on the rest of the comment because it’s not something I have extensive experience with, but I do want to point out that Go isn’t really the best comparison point since goroutines are green threads with growable stacks, so their memory usage is going to be lower than native threads. Any discussion about memory usage of threads probably also needs to account for overcommit of virtual vs resident memory.
All of this is moot in Rust due to futures being stackless, so my understanding (I reserve the right to be incorrect) is that in theory they should always use less memory than a stackful application.
That is precisely my point: memory efficiency is an argument for stackless coroutines over stackful coroutines, but it is not an argument for async IO (in whichever form) over threads.
Sorry for reading and replying to your comment so late. The minimum stack size of a goroutine is 2 kB. The default stack size of a thread on Linux is often 8 MB. Of course, the stack size of most goroutines will be higher. And similarly, it is usually possible to reduce the stack size of a thread to 1 MB or less if we can guarantee the program will never need more. Is that how you concluded that the difference was somewhere around 4x-8x?
I like your point about the fact that the memory for buffers, usually allocated on the heap, should be the same regardless. Never thought of that :)
The stack size of a thread can be just a few kilobytes on Linux since the pages don’t actually get mapped until accessed.
I know, but I’ve always wondered what happens when a program has hundreds of thousands of threads. Will the TLB become too large, with a lot of TLB misses making the program slow? When the stack of a thread grows from 4 kB to 8 kB, how is that mapped to physical memory? Does it mean there will be 2 entries in the TLB, one mapping the first 4 kB, and another the second 4 kB? Or will the system allocate a contiguous segment of 8 kB, and copy the first 4 kB to the new memory segment? I have no idea how it works concretely. But I would expect these implementation “details” to impact performance when the number of threads is very large.
I did some reading and will try to answer my own questions :)
Q: Will the TLB become too large, with a lot of TLB misses making the program slow?
A: The TLB is a cache and has a fixed size. So no, the TLB can’t become “too large”. But if the working set of pages becomes too large for the TLB, then yes there will be cache misses, causing TLB thrashing, and making the program slow.
Q: When the stack of a thread grows from 4 kB to 8 kB, how is that mapped to physical memory?
A: The virtual pages are mapped to physical pages on demand, page by page.
Q: Does it mean there will be 2 entries in the TLB, one mapping the first 4 kB, and another the second 4 kB?
A: Yes. At least this is the default on Linux, as far as I understand.
Q: Or will the system allocate a contiguous segment of 8 kB, and copy the first 4 kB to the new memory segment?
A: No.
Q: I would expect these implementation “details” to impact performance when the number of threads is very large.
A: If the stacks are small (a few kB), then memory mapping and TLB thrashing should not be a problem.
It’s 8 megs of virtual memory. Physically, only a couple of pages will be mapped. A program that spawns a million threads will use dozens of megs not 8 gigs of RAM.
Correct, I keep forgetting about this. But assuming that each thread maps at least a 4 kB page, and that the program spawns a million threads, then it should use 1 million x 4 kB = 4 GB, and not dozens of megs? Or am I missing something?
typo, meant to say thousand! But I guess you could flip that around and say that dozen of gigs is enough for million threads, not for a thousand!
I like that: “a dozen of gigs are enough for a million threads” :)
The stack size isn’t a problem at all. Threads use virtual memory for their stacks, meaning that if the stack size is e.g. 8 MiB, that amount isn’t committed until it’s actually needed. In other words, a thread that only peaks at 1 MiB of stack space will only need 1 MiB of physical memory.
Virtual address space in turn is plentiful. I don’t fully remember what the exact limit is on 64-bit Linux, but I believe it was somewhere around 120-something TiB. Assuming the default stack size of 8 MiB of virtual memory and a limit of 100 TiB, the maximum number of threads you can have is 13 107 200.
The default size is usually also way too much for what most programs need, and I suspect most will be fine with a more restricted size such as 1 MiB, at which point you can now have 104 857 600 threads.
Of course, if the amount of committed stack space suddenly spikes to e.g. 2 MiB, your thread will continue to hold on to it until it’s done. This however is also true for any sort of userspace/green threading, unless you use segmented stacks (which introduce their own challenges and problems). In other words, if you need 2 MiB of stack space then it doesn’t matter how clever you are with allocating it, you’re going to need 2 MiB of stack space.
The actual problems you’ll run into when using OS threads are:
context switch costs, and sysctl settings (e.g. most Linux setups will have a default limit of around 32 000 threads per process, requiring a sysctl change to increase that). Some more details here.
Of these, context switch costs are the worst because there’s nothing you as a user/developer can do about this, short of spawning fewer OS threads. There also doesn’t appear to be much interest in improving this (at least in the Linux world that I know of), so I doubt it will (unfortunately) improve any time soon.
What is the canonical resource that explains why context switch cost differs between the two? I used to believe that, but I no longer do after seeing
https://github.com/jimblandy/context-switch
And, specifically,
So, currently I think I don’t actually know the relative costs here, and I choose not to believe anyone who claims that they know, if they can’t explain this result.
EDIT: to clarify, it very well might be that the benchmark is busted in some obvious way! But it really concerns me that I personally don’t have a mental model which fits the data here!
From what I understand, there are two factors at play (I could be wrong about both, so keep that in mind):
The number of context switches is something you might not be able to do much about, even with thread pinning. If you have N threads (where N is a large number) and you want to give them a fair time slice, you’re going to need a certain number of context switches to achieve that.
This means that we’re left with reducing the context switch time. When doing any sort of userspace threading, the context switch time is usually in the order of a few hundred nanoseconds at most. For example, Inko can perform a context switch in somewhere between 500 and 800 nanoseconds, and its runtime isn’t even that well optimized.
To put it differently, it’s not that context switching is slow, it’s that it isn’t fast enough for programs that want to use many threads.
Your two comments here are some of the best things I’ve read about the topic in a while! Consider writing a blog post about this whole thing! In particular,
Is not something I’ve heard before, and it makes some sense to me (though I guess I still need to think about it more — with many threads, most threads should be idle (waiting for IO, not runnable)).
I did write about this as part of this article about asynchronous IO, which refers to some existing work I based my comments on.
I’d been waiting for someone who knew more about the kernel guts to comment, but I guess that’s not going to happen, so here goes.
The context switching cost shouldn’t depend on the number of threads that exist, although there were one or two Linux versions in the early 2000s with a particularly bad scheduler where it did. I don’t buy that the number of context switches (per unit time) increases with the number of threads either in most cases; in a strongly IO-bound program it will depend solely on the number of blocking calls, and when CPU-bound it will be limited by the minimum scheduling interval*.
I am not convinced about the method in the repo you linked. Blocking IO and async are almost the same thing if you force them to run sequentially. Whether this measures context switch overhead fairly is beyond my ken, but I will say that a reactor that only ever dispatches one event per loop is artificially crippled. It’s doing all its IO twice.
Contrary to what one of the GH issues says, though, it’s probably not doing context switches. Like pretty much any syscall epoll_wait isn’t a context switch unless it has to actually wait.
This isn’t a degenerate case for blocking IO and that’s enough to make up for a bit of context switching. I think that’s all there is to it.
In general, though, the absolute cost of blocking IO is lower than I think almost everyone assumes. Threads that aren’t doing anything only cost memory (to the tune of a few KiB of stack) and context switches are usually a drop in the ocean compared to whatever work your program is actually doing. I think a better reason to avoid lots of threads is the loss of control over latency distribution. Although terrible tail latency with lots more threads than cores is often observed, I don’t know that I’ve ever read a particularly convincing explanation for this.
* Although that is probably too low (i.e. frequent) by default.
RAM is a lot cheaper nowadays than it was when c10k was a scaling challenge; 10,000 connections * 1 MiB stack is only ~10 GiB. Even if you want to run a million threads on consumer-grade server hardware (~ 1 TiB), the kinds of processes that have that kind of traffic (load balancers, HTTP static assets, etc) can usually run happily with stack sizes as small as 32 KiB.
That just means that the bar should be higher now: c10M or maybe c100M.
As for running OS threads with a small stack size, why should we have to tune that number when, with async/await, the compiler can produce a perfectly sized state machine for the task?
If your use case requires handling ten million connections per process, then you should use the high-performance userspace TCP/IP stack written by the hundred skilled network engineers in your engineering division.
Don’t try to write libraries to solve problems at any level of scale. Use a simple library to solve simple problems (10,000 connections on consumer hardware), and a complex library to solve complex problems (millions of connections on a 512-core in-house 8U load balancer).
Because you’ll need to tune the numbers anyway, and setting the thread stack size is a trivial tuning that lets you avoid the inherent complexity of N:M userspace thread scheduling libraries.
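For example, with POSIX threads the stack size really is a one-liner at spawn time. A minimal sketch (the 64 KiB value is an arbitrary illustrative choice; only the pages actually touched become resident):

```cpp
#include <pthread.h>
#include <cstdio>

// Worker should keep per-thread stack usage small:
// no large local arrays, no deep recursion.
void* worker(void*) {
    return nullptr;
}

int main() {
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    // 64 KiB of *virtual* stack instead of the 8 MiB default.
    pthread_attr_setstacksize(&attr, 64 * 1024);

    pthread_t tid;
    if (pthread_create(&tid, &attr, worker, nullptr) != 0) {
        std::perror("pthread_create");
        return 1;
    }
    pthread_join(tid, nullptr);
    pthread_attr_destroy(&attr);
}
```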
This is exciting, and I’m looking forward to trying it in place of iTerm.
What are the security goals and architecture? Lots of features means lots of attack surface, exposed to remote servers and files accidentally cat’d to the terminal. For spatial memory safety, does it/can it use Zig’s release-safe mode? For temporal memory safety, what allocation strategy is used? Are the internals friendly to targeted fuzzing? Are there dangerous features exposing command execution or the filesystem? Is bracketed paste supported adversarially? /cc @mitchellh
To be clear, this is not a list of gotchas, but of potential wins over other emulators that would make me particularly excited about switching.
Great questions. Ghostty 1.0 will be just as vulnerable to terminal escape security issues as basically any other terminal (as described in recent blackhat talks and so on).
We have some rudimentary protections common to other terminals (but also notably missing from many) such as an “unsafe paste warning” where we detect multi line pastes or pastes that attempt to disable bracketed paste and so on. It’s not sufficient to call a terminal secure by any means but does exist.
On Linux we have the ability to put shell processes into cgroups and currently do for memory protections but stop there. When I implemented that feature I noted I’d love to see that expanded in the future.
I think the real answer though is that this is one of my top goals for future improvements to the terminal sequence level as I noted in the future section. I’m working on a design now but I want to enable shells to drop privileges while children are running much in the same way OpenBSD has a syscall to drop privileges. For example, a shell or program should be able to say “ignore all escape sequences” (maybe except some subset) so things like tailing or catting logs are always safe.
The security framework is probably my first major goal for innovation in the future. For Ghostty 1.0, I’ve more or less inherited all the same architectural problems given by the underlying specifications of old (de facto and de jure).
If you’d like to be a part of this please email me, I have a ton of respect for your work and consider you far more of a security expert than me!
I realized I didn’t answer some of your other questions in my other response. Let me follow up with that now:
For memory safety, we currently recommend ReleaseFast. The speed impact of the safety checks in the safe build is too great and noticeably makes some things not smooth. We need to do a better job of strategically disabling safety checks on certain hot paths to make safe builds usable, but that is something I want to do.
Our debug builds (not release safe) have extra memory integrity checks that make builds VERY slow but also VERY safe. I’d like to set up build machines that run fuzz testing on these builds 24/7 using latest main. This is not currently in place (but the integrity checks are in place).
Re: fuzzing: the internals are very friendly to fuzzing. We’ve had community members do fuzzing (usually with afl) periodically and we’ve addressed all known fuzz crashes at those times. It has been a few months since then so I’m not confident to say we’re safe right now. :)
Longer term, I’m interested in making an entire interaction sequence with Ghostty configurable in text so we can use some form of DST and fuzzing to continuously hammer Ghostty. That’s a very important goal for me.
I’m not super familiar with the intricacies of Zig, but does it have any way to toggle individual safety checks? IME there are things like bounds checks that are basically free and massively help with safety, and then you have things like null checks that aren’t as useful for safety but are still basically free, and then you have things like checked addition that are comparatively expensive but don’t really give as much on their own. That might be something to check out?
Great questions. Curious about this as well.
Can someone who is knowledgeable in both Zig and Rust speak to which would be “better” (not even sure how to define that for this case) to learn for someone who knows Bash, Python and Go, but isn’t a software developer by trade? I’m an infrastructure engineer, but I do enjoy writing software (mostly developer tooling) and I’m looking for a new language to dip my toes into.
The real answer is both. Rust’s borrow checker is a game changer. But Zig’s comptime is a game changer as well.
If you only have space for one, then go with Rust as the boring choice which is already at 1.0.
I second this and will also add that Zig’s use of “explicit” memory allocation (i.e. requiring an Allocator object anytime you want to allocate memory) will train you to think about memory allocation patterns in ways no other language will. Not everyone wants to think about this of course (there’s a reason most languages hide this from the user), but it’s a useful skill for writing high performance software.
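Not Zig, but if you want a rough C++ analogue of that “pass the allocator explicitly” style, std::pmr lets the same container code be pointed at a different allocation strategy (a sketch, not a claim about how Zig’s allocators work):

```cpp
#include <array>
#include <cstddef>
#include <memory_resource>
#include <vector>

int main() {
    // A small stack-backed arena; allocations are just pointer bumps.
    std::array<std::byte, 4096> buffer;
    std::pmr::monotonic_buffer_resource arena{buffer.data(), buffer.size()};

    std::pmr::vector<int> fast_scratch{&arena};  // bump-allocates from the arena
    for (int i = 0; i < 100; ++i) fast_scratch.push_back(i);

    // Same container type, ordinary heap allocation instead.
    std::pmr::vector<int> general{std::pmr::new_delete_resource()};
    general.push_back(42);
}   // the whole arena is released at once; no per-element frees
```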
Reminds me of Type Checking vs. Metaprogramming; ML vs. Lisp :-) Someone should write Borrow Checking vs Metaprogramming; Rust vs. Zig
I think the “both” answer is kinda right, which annoys me a little, because it is a lot to learn. But I can accept that we’ll have more and more languages in the future – more heterogeneity, with little convergence, because computing itself is getting bigger and more diverse
e.g. I also think Mojo adds significant new things – not just the domain of ML, but also the different hardware devices, and some different philosophies around borrow checking, and close integration with Python
And that means there will be combinatorial amounts of glue. Some of it will be shell, some will be extern "C" kind of stuff … Hopefully not combinatorial numbers of build systems and package managers, but probably :-)
Whichever the case, you need to learn to appraise software yourself, otherwise you will have to depend on marketing pitches forever.
Try both, I usually recommend to give Rust a week and Zig a weekend (or any length of time you deem appropriate with a similar ratio), and make up your own mind.
If you’re interested in the more philosophical perspective behind each project, check out this talk from Andrew, creator of Zig https://www.youtube.com/watch?v=YXrb-DqsBNU
I’m sure Rust must have an equivalent talk or two that knowledgeable Rust users could recommend you.
If you’re new to low-level programming in general then Rust will almost certainly be easier for you – not easy, but easier.
Zig is a language designed by people who love the feeling of writing in C, but want better tooling and the benefit of 50 years of language design knowledge. If Rust is an attempt at “C++ done right”, Zig is maybe the closest there is right now to “C done right”. The flip side to that is part of the C idiom they cherish is being terse to the point of obscurity, and having relatively fewer places where the compiler will tell you you’re doing something wrong.
IMO the best ordering is Rust to learn the basics, C to learn the classics, and then Zig when you’ve written enough C to get physically angry at the existence of GNU Autotools.
I would also recommend “Learn Rust the Dangerous Way” once you know C (even if you already know Rust by then), to learn how to go from “C-like” code to idiomatic Rust code without losing any performance (in fact, gaining). It’s quite enlightening to see how you can literally write C code in Rust, then slowly improve it.
https://cliffle.com/p/dangerust/
FWIW, the main author of zig hates this comparison, and intends zig to replace C++ more than C.
(I can’t find him saying that right now, so it’s from memory)
https://mastodon.social/@andrewrk/113229093827385106
Thank you! I scrolled his feed a bit and must have skipped over it
The quote doesn’t say that he intends it to replace C++, just that he wants to use it for problems he previously used C++ for
That is a very important distinction, because I’m very sure there are lots of C++ programmers who like programming with more abstraction and syntax than Zig will provide. They’ll prefer something closer to Rust
I’m more on the side of less abstraction for most things, i.e. “plain code”, Rust being fairly elaborate, but people’s preferences are diverse.
BTW Rob Pike and team “designed Go to replace C++” as well. They were writing C++ at Google when they started working on Go, famously because the compile times were too long for him.
That didn’t end up happening – experienced C++ programmers often don’t like Go, because it makes a lot of decisions for them, whereas C++ gives you all the knobs.
http://lambda-the-ultimate.org/node/4554
Some people understand “replacement” to mean, “it can fill in the same niche”, while others mean, “it works with my existing legacy code”.
I always interpreted it to mean the former, so to me Zig is indeed a C++ replacement. As in, throw C++ in the garbage can, stop using it forever, and use Zig instead. Replace RAII with batch operations.
To the world: Your existing C++ code is not worth saving. Your C code might be OK.
Best 5 words argument I’ve ever read against RAII.
the raison d’être for the language is “Focus on debugging your application rather than debugging your programming language knowledge.”
which seems aimed squarely at c++ rather than c
As a university student, I’d prefer Zig more. Zig is easier to learn (it depends) and, for me, I can understand some things more deeply when writing Zig code and using Zig libraries. Rust has a higher level of abstraction which prevents you from touching some deeper concepts to some extent. Zig’s principle is to let the user have direct control over the code they write. Currently Zig’s documentation isn’t detailed, but the code in the std library is very straightforward; you can read it without enabling the zls language server, or use a text editor with only code highlighting and still have a comfortable code reading experience.
I am not an expert in Zig, but there was a thread about Rust and Zig by the person maintaining the Linux kernel driver (written in Rust) for the new Apple GPUs here:
https://mastodon.social/@lina@vt.social/113327856747090187
Read the comments as well, @lina is much more into rust than zig, so those might provide some extra perspective.
More specifically, if you’re coming from Python and Go in particular, I think you will enjoy Rust’s RAII and lifetime semantics more. Those are roughly equivalent to Python’s reference counting at compile time (or at runtime if you need to use Rc/Arc). It all ends up being a flavor of automatic memory management, which is broadly comparable to Go’s GC too. And Rust gives you the best of both worlds: 100% safe code by default (like Python, in fact, even stronger since Python lets you write “high-level, memory safe” data races without thinking but Rust makes it more explicit) and equal or higher performance than Go, with fast threading.
Zig sounds more aimed towards folks that come from C, and don’t want to jump into the “let the compiler take care of things for me” world. That said, I’m not experienced with Zig by any means, so you might want to hear from someone who is.
Regarding the original post, what if de-initialization can fail? I always found RAII to be relatively limited for reasons like that
It shouldn’t always be silent/invisible.
And I feel like if RAII actually works, then your resource problem was in some sense “easy”.
I’m not sure if RAII still works with async in Rust, but it doesn’t with C++. Once you write (manual) async code you are kind of on your own. You’re back to managing resources manually.
I googled and found some links that suggest there are some issues in that area with Rust:
https://internals.rust-lang.org/t/wanted-a-way-to-safely-manage-scoped-resources-in-async-code/14544/4
https://github.com/rust-lang/wg-async/issues/175
Then why do people fuck up so much?
If the resource doesn’t need any asynchronous operations to be freed, works great. Which is to say, 99% of resources will still be handled by RAII.
I don’t know of any evidence that there are more mistakes, compared with, say, defer.
Also, please tone it down a bit … some of your comments are low on information, high on emotion.
I read through it, and as someone who has used both, that whole thread is not arguing well for zig, only for rust; it has a lot of trolls in it that are probably just after lina (I know there are multiple). Most of us who prefer zig to rust are not deranged loonies like many in that conversation.
The Meatlotion troll admitting they were a script kiddie at the end was the pure catharsis I needed today. Thank you.
This post on why someone rewrote their Rust keyboard firmware in Zig might help you understand some of the differences between the two languages: https://kevinlynagh.com/rust-zig/
Discussed on Lobsters
You’ll probably get along easier with Rust, but Zig might just bend your mind a little more. You need a bit more tolerance of bullshit with Zig since there’s less tooling, less existing code, and you might get stuck in ways that are new, so your progress will likely be slower. (I have one moderately popular library in Rust, but spend all my “free” time doing Zig, which I think demonstrates the difference nicely!)
Oh, I had the impression as an observer that this was the reverse. Doesn’t rust bend the mind enough?
I guess I think of what’s involved with learning to write Rust as more of an exercise (learn the rules of the borrow checker to effectively write programs that pass it), whereas imo with Zig there’s some real novelty in expressing things with comptime. It of course depends on your baseline; maybe sum types are new enough to you already.
One of the things I dislike about Rust’s documentation and educational material the most is that it’s structured around learning the rules of the borrow checker to write programs that pass it (effectively or not :-) ), instead of learning the rules of the borrow checker to write programs that leverage it – as you put it, effectively writing programs that pass it.
The “hands-on” approach of a lot of available materials is based on fighting the compiler until you come up with something that works, instead of showing how to build a model structured around borrow-checking from the very beginning. It really pissed me off when I was learning Rust. It’s very difficult to follow, like teaching dynamic memory allocation in C by starting with nothing but
null pointers and gradually mallocing and freeing memory until the program stops segfaulting and leaking memory. And it’s really counterproductive: at the end of the day all you’ve learned is how to fix yet another weird corner case, instead of gaining more fundamental insight into building models that don’t exhibit it.
I hope this will slowly go out of fashion as the Rust community grows beyond its die-hard fan base. I understand why a lot of material from the “current era” of Rust development is structured like this, because I saw it with Common Lisp, too. It’s hard to teach how to build borrow checker-aware models without devoting ample space to explaining its shortcomings, showing alternatives to idioms that the borrow checker just doesn’t deal well with, explaining workarounds for when there’s no way around them and so on. This is not the kind of thing I’d want to cover in a tutorial on my favourite language, either.
I don’t know Zig so I can’t weigh in on the parent question. But with the state of Rust documentation back when I learned it (2020/2021-ish) I am pretty sure there’s no way I could’ve learned how to write Rust programs without ample software development experience. Learning the syntax was pretty easy (prior exposure to functional programming helped :-) ) but learning how to structure my programs was almost completely a self-guided effort. The documentation didn’t cover it too much and asking the community for help was not the most pleasant experience, to put it lightly.
That’s a good one! There is a thin line between fearless and thoughtless.
If you like Go, you might like Zig, since both are comparatively simple languages. You can keep all of either language in your head. This means lots of things are not done for you.
Rust is more like Python, both are complicated languages, that do more things for you. It’s unlikely you can keep either one fully in your head, but you can keep enough in your head to be useful.
I think this is why many people compare Rust to C++ and Zig to C. C++ is also a complicated language; I’d say it’s one of the most complicated around. Rust is not as bad as C++ yet, since it hasn’t been around long enough to have loads of cruft. Perhaps, with the way Rust is structured around backwards compatibility, it will find a way to keep the complications reasonable. So far most Rust code-bases have enough in common that you can get along. In C++ you can find code-bases that are not similar enough that they even feel like the same language.
It should also be noted that Zig is a lot younger than Rust, so it’s not entirely clear how far down the complicated path Zig will end up, but I’d guess based on their path so far, they won’t go all in on complicated like Rust and C++.
Well, @matklad is already here, but for me, coming from Go and frustrated after some time trying Rust (twice), I was motivated to try Zig by @mitchellh talking with @kristoff about why he chose Zig for Ghostty (his terminal emulator project), and how it matches my experience/profile…
… the reason I personally don’t like working too much in Rust, I have written rust, I think as a technology it’s a great language it has great merits but the reason I personally don’t like writing Rust is every project that I read with Rust ends up basically being chase the trait implementation around, it’s like what file is this trait defined, what file is the implementation is, how many implementations are there.. and I feel like I’m just chasing the traits and I don’t find that particularly.. I don’t know, productive I should say, I like languages that you read start on line one you read ten and that’s exactly what happened and so I think Zig’s a good fit …
Basically I’m more into suckless philosophy I think; also liked @andrewrk talking about the road to 1.0 and Non-Profits vs VC-backed Startups etc… So I recommend creating something real in both, using the refs posted here, some rustlings (plus blessed.rs) and ziglings (or my online version, plus zigistry.dev), to get a better fit for you ; )
At this point I felt crazy for even considering Rust. I had accomplished more in 4 days what took me 16 days in Rust. But more importantly, my abstractions were holding up.
@andrewrk before Zig on progress so far
Not speaking to the languages at all, but I’d say to choose the more mature language - Rust. Even after learning Rust, I still told people to just learn C++ if the goal was to learn that kind of language. That’s a trickier choice now (C++ vs Rust) because Rust has reached a tipping point in terms of resources, so it’s easier to recommend. Zig is just way too early and it’s still not a stable language, I wouldn’t spend the time on it unless you have a specific interest.
If the goal is just to reduce useless pull requests by pedants, I’d suggest using language like “C++, D, and Go use control flow mechanisms like throw/catch or panic/recover, which can prevent bar() from being called”. I believe it would get the message across without triggering a tedious semantic argument about what the one true definition of “Exception” is.
It’s easy to have conflicting opinions on whether or not panic/recover is about “exceptions”, and whether or not it is good practice to use it, but it’s a hard fact that it exists in the language and that it is control flow.
Agreed. It seems like the motivation of the Zig documentation is to show that Zig has no hidden control flow - which is great! Maybe it’s better to avoid comparisons to other languages in that respect. It’s not a zero-sum game, there’s no need for value judgements.
Agreed. It seems more pragmatic to make that simple change than to refer to a blog post every time someone mentions it.
Almost full circle. They reinvented the “password manager”, but with extremely complicated specifications (FIDO2, WebAuthn), especially considering the purpose, a shitload of JS and a pile of legacy crypto on top 🤡
Except that password managers use symmetric cryptography (shared secrets). Passkeys use asymmetric cryptography (public-private key pair), which comes with a lot of security benefits.
I have never really been able to figure out what a passkey is so apologies if this is missing the point, but the password manager I use uses asymmetric cryptography. I haven’t really used 1password or lastpass or any of the popular password managers; do those work differently?
The underlying protocol is WebAuthn, and “passkeys” is basically a mass-market branded version of it which is getting a lot of attention.
The basic idea is that instead of having a password or other type of shared credential, for each site you create a public/private keypair and the site stores the public key as part of your account data. Then when you need to authenticate to the site, they present you with something to sign using the private key, and use the public key on file to check the signature.
The claimed advantages here are:
The last bit is really an understated killer feature. Non-password-based authentication systems have historically had terrible user experience, bordering on completely unusable for insufficiently-technical people. And public-key cryptography also has historically had terrible user experience (I recall a conference once where the statement “we will drink until PGP makes sense” was uttered, for example). Automating the whole process works wonders for improving the experience.
If everyone uses a password manager, the long and hashed unique passwords also don’t give the attacker anything
On the other hand, if someone gets your device in their hands, they now have immediate access to all sites. It’s also not trivial to revoke the access - you can’t just use another device to do it, because 1.) how do you know which passkeys to revoke and 2.) what stops the attacker from doing this before you do it? But if you have to follow a different process (without passkeys) to revoke passkeys, then we are back to the fact that this different process can be used for phishing attacks.
And when it doesn’t, it becomes much more complicated. And on the implementation side it also becomes more complicated; hashing and comparing passwords is a very simple thing to do in comparison.
All in all, a lot of open questions and the question is really if the added complexity does not create more (maybe yet unknown) security issues in combination with human usage than password managers.
Ok, there’s a fundamental misunderstanding of the threat model here.
Passkeys protect against all the actual real world threats. They do not protect against someone with physical access to your hardware, and knowledge of the login/unlock credential of that hardware. But if that is your threat model, passwords also fail.
However, even in the case of that level of compromise, the design of passkeys was intentionally such that the actual key material could not be extracted, so even that level of compromise would not allow for secret use of the credentials because passkeys were still restricted to the host device. Obviously once you add the ability to export credentials now an attacker with this level of access can extract that key material alongside all your passwords.
To go through your points:
Which they get with your passwords already
All of them, just like all your passwords
Nothing, just like your passwords
Just like passwords.
On the other hand, all that “complicated” stuff means:
Again, the problem here is a failure to recognize the actual threat model. Outside of specific (expensive) cases, the real world problem is not someone stealing your/your user’s hardware and then using the independently discovered passwords from that hardware (because stealing your hardware doesn’t magically bypass your device passwords). But if that is a threat model that matters in your use cases, then passwords are still broken there as well, and the issue is that while objectively more secure and harder to exploit than passwords, passkeys are still possibly not good enough.
The overwhelming attack vector in the real world is phishing, social engineering, malware, and similar attacks, where the goal is to get you (or your system) to - by some mechanism - provide all the required log in credentials for the target site. Something that cannot be done with pass keys, because of the complicated stuff you’re questioning the value of. There is little to no interest in physical access to your devices, and in general the relevant parties would like to not even be in the same countries as their victims, let alone the general vicinity.
The problem of “what if someone has access to my device[s] and the passwords/codes required to unlock them” is outside of the bounds of any of this, because at that point they can access any and all of the data that device ever has. Again, prior to this proposal to support exporting passkeys, passkeys were still superior to passwords because by design your attacker would not be able to copy the key material and so could not just silently reuse your passkeys at their leisure, which they can easily do with passwords, which is to my mind much worse.
Not if the passwords are in my head only. Which is true for the really important ones.
Yeah, but if I have access to all the passkeys and create new ones, why would I need the “credentials” (i.e. the private key)?
Yes they can because of the “different process” for recovery. You might claim that this is harder to abuse, but I’m not so sure. It might turn out it’s not.
See above
And the attacker only needs one device instead of two now. Just like with a password manager.
That last part is really the only thing that is really positive - it’s guaranteed that an attacker can’t access the service even if they are able to get a copy of the database of the service. With passwords, the service needs to have the password hashed and not in plaintext.
Not sure if that’s worth all the disadvantages…
Then, even if they are good, they’re just as susceptible to phishing as a password manager, only a password manager is not as easily fooled by online phishing scams as people are. Neither really has a meaningful defense against multi-step social engineering.
Because passkeys are not passwords - they’re a key that is used for a cryptographic operation. Signing in is not a matter of vending a secret to the host. The host service transmits a nonce, the browser takes that nonce, looks up the passkey for the service it has connected to over TLS, if it finds one, and then you approve it (via some local authentication mechanism - i.e. malware compromising the device cannot arbitrarily perform these operations; that’s part of the reason for the HSM involvement) - then the passkey implementation’s HSM signs a response that includes the server-provided nonce, and the browser sends that back to the server. Which validates your current session.
So having access to your device, and the ability to unlock your device, an attacker can sign into a service that uses a passkey - just like they could with a password manager. But now, this person - who remember, we’ve already declared has your device passwords - can copy all of your passwords off your device to use at their leisure. To abuse their current compromise of your device they would have to add an additional passkey (something you can and should be notified about, and you can see the list of valid keys).
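For anyone who finds the prose hard to follow, here is a deliberately toy sketch of that challenge–response shape. The crypto is stubbed out and every name is illustrative; real passkeys use WebAuthn with an ECDSA/Ed25519 key held in a secure element, so treat this only as an outline of the flow:

```cpp
#include <iostream>
#include <map>
#include <random>
#include <string>

// Stand-ins for real asymmetric crypto. These are NOT secure; they only exist
// so the protocol shape below compiles and runs.
struct KeyPair { std::string private_key, public_key; };
std::string sign(const std::string& priv, const std::string& msg) {
    return "sig(" + priv + "," + msg + ")";
}
bool verify(const std::string& pub, const std::string& msg, const std::string& sig) {
    // In this toy, the "public key" is the private key with a "pub:" tag.
    return sig == "sig(" + pub.substr(4) + "," + msg + ")";
}
KeyPair make_keypair() {
    static std::mt19937_64 rng{std::random_device{}()};
    std::string priv = std::to_string(rng());
    return {priv, "pub:" + priv};
}

int main() {
    // Registration: client generates a keypair *per origin*; the server stores only the public key.
    std::map<std::string, KeyPair> client_keys;   // origin -> keypair (lives in the authenticator)
    std::map<std::string, std::string> server_db; // username -> public key (lives on the server)

    const std::string origin = "https://example.com";
    client_keys[origin] = make_keypair();
    server_db["alice"] = client_keys[origin].public_key;

    // Login: the server sends a fresh nonce; the client signs it with the key bound to that origin.
    const std::string nonce = "nonce-12345";  // fresh per login attempt, so responses can't be replayed
    const std::string sig = sign(client_keys[origin].private_key, nonce + "|" + origin);

    // The server verifies with the stored public key. A phishing site at a different
    // origin would get a signature over *its* origin, which would not verify here.
    bool ok = verify(server_db["alice"], nonce + "|" + origin, sig);
    std::cout << (ok ? "authenticated\n" : "rejected\n");
}
```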
The attacker doesn’t need any of your devices, that’s literally why phishing works. 2FA increases the cost of compromise, but in your threat model your accounts are valuable enough for them to be working towards and successfully getting:
So the cost of intercepting the standard 2FA schemes is not a major issue; moreover, they can extract the state from any of your 2FA apps - given they have access to your 2FA apps and device passwords, they can use those to add their own devices to your 2FA mechanisms.
You seem to not be understanding the attack vectors here.
The reason for password hashing is not to protect your own service. At the point your servers are compromised to the point that an attacker has access to your authentication database, your service, and all of the accounts it hosts are completely compromised. The reason for password hashing is so that an attacker cannot take the passwords from your server, and then try them on other services. Run by other people.
Password hashing is not a method to protect your service, or to protect your users on your service. It is solely to protect your users against those secondary attacks, which do not otherwise involve you at all.
Passkeys render that problem moot, not because there’s no password to leak, but because the client will never use a passkey from Service A to authenticate against anything that cannot prove that it is itself Service A. So authentication is not relayable (and so phishing and spoofing are not possible), and any given authentication response is not replayable, so a moment-in-time compromise does not provide data that allows the attacker to authenticate again in future.
The purpose of passkeys is to secure users against actual real world attack vectors, rather than hypothetical attacks, especially ones as extreme as what you are suggesting. They are more secure than passwords in every practical sense, they are more secure than any multi factor scheme.
They are not robust against your proposal of a person who has gained physical access to your hardware, and who has the ability to authenticate themselves as you on that hardware. But that vector is even more catastrophic for passwords, including those that are in your head, because at that point your attack vectors include malware, key loggers, malicious extensions, etc which mean your passwords can be intercepted, and then used at will by the attacker.
That’s your call, but I’m concerned that you are making that call without an accurate threat model (your go to model is a local hardware attack by someone with you device passwords), nor an accurate model of the complexity, like I suspect you have no problem with your site using TLS, despite TLS being vastly more complex.
Again: you either lose full access once you lose your passkey(s) - or you have a recovery flow and now a phisher does not need access to the device but rather to the information in the recovery flow.
Honestly, instead of making assumptions, maybe try to understand what I’m saying.
And yes, my threat model might be different than yours. I see anything between me and a service as a potential threat and that includes my hardware, my OS and so on.
And yes, I do have my doubts about TLS, and let’s not pretend there haven’t been major security issues with those certificates before.
You’re doing something 99% of users never will.
And the attacker can’t open my device, and thus can’t access my passkeys, because they would need to know my pin code (that’s in my head), or use physical violence to get me to put my thumb on the fingerprint scanner.
Or steal your fingerprint from a glass…
modern fingerprint sensors scan the tissue underneath the outer skin. it’s not just the oil pattern left by the wrinkles
Maybe. If the manufacturer gives me a guarantee and is regulated by law and has to have liability insurance, I might put my trust in it.
Do CIA and NSA rely on it? No? Then I have my doubts.
If you can get 100% of people to follow 100% perfect security hygiene 100% of the time, sure, technical improvements in the nature of the credentials don’t matter. I wish you luck in your attempt. One thing passkeys and other non-password-based systems try to address is the fact that we’ve never succeeded at this and so maybe “fix the people” isn’t going to be the solution.
The mobile-device implementations I’ve seen rely on the mobile device’s biometrics to gate access to passkeys. Same for desktop implementations – lots of computers have a fingerprint reader, for example. And lots of people already set up that stuff for convenience, which means you don’t have to change user behavior to get them to do it.
Besides, someone getting their hands on your device is an immediate breach of your password manager, too, and if you want to argue the password manager should be locked by a password or biometrics, well, passkeys should be too, so password managers offer no improvement.
“Simple” hashing and comparing of passwords is anything but.
At any rate, it seems pretty clear that you are not really open to having your mind changed and that no amount of argument or evidence would change it, so I’m going to drop out of this conversation.
The same is true for passkeys. If no one uses them, they don’t help. Or do you mean forcing everyone to use passkeys? Yeah, well, great. And sure, you can’t force people to use password managers.
And is that guaranteed to be secure? Can a Scammer not fake your fingerprint with all those devices? As far as I know the devices can usually be tricked pretty easily.
That’s actually true and it is the reason why I’d like to see a password manager that limits the number of accesses over time. Ofc. for that to work it can’t run locally.
Compared to passkeys? I think it is.
But the shared secret (password) is still symmetric cryptography. This matters because with passkeys, a bad actor who controls the website for a few seconds can only extract your token and use it for a while; but with a password, they can extract the password itself and then use it to generate new tokens.
(there are ways around this even with symmetric cryptography such as hash-chains, but it’s not really used in the wild)
More a logical extension to the concept of a password manager, not a reinvention
That really only makes sense as a rebuttal if there are significant advantages. The password manager workflow is that I create a secret key which I store in a secure key store, I give that key to the website, and then the website gives my client a temporary access token when I provide my key. That temporary access token is then used for day-to-day authentication, requiring renewal using the secret key every now and then or when authenticating a new device.
This, honestly, is a pretty great system. What’s more, it’s universally supported everywhere already. What benefit do passkeys actually provide compared to this?
A shared secret, not a secret key.
I could go on, but I’m sure any search engine could answer this question in great depth.
Passkeys make Sign in via Apple / Google style flows accessible to all without vendor lockin. I look forward to them being better baked.
What do you mean by “passkeys can’t be leaked”? Surely they could theoretically leak if e.g Apple’s passkey syncing service is compromised?
Specifically:
They are not vulnerable to interception. If your browser is compromised or some part of the connection is insecure, for example, and you use a password then that password is visible as plain text. An attacker can copy it and then later, from another machine, log into the service. In contrast, with a pass key the attacker sees only the challenge-response for the current session. They cannot replay that later.
They are not vulnerable to leakage if the server does not store them securely. A lot of systems over the past decades have been compromised and had password databases leak. Some of these were unhashed and so were complete compromises. If you reused a password, other systems may be compromised with it. Even with hashing, older hash schemes are vulnerable to brute force attacks and so attackers can find passwords. With a pass key, the server holds only the public key. Exfiltrating this does not help the attacker (unless they have a quantum computer, but it’s easy to move passkeys to PQC before this becomes a realistic threat).
They cannot be reused. You may have a single secret, but this is used with the remote host name in a key derivation function to provide a unique private key for each service. This prevents impersonation attacks because your key for service X and your key for service Y are guaranteed to be different. If you go to a phishing site and it forwards a challenge from another site, you will sign it with the wrong key and the login will not work.
If Apple’s sync service is compromised, passkeys can leak, but this would have to happen at the point when you set up a new device. The secure elements on the sync’d devices perform a key exchange protocol via iCloud, and with iCloud providing some attestation as to device identity when you enable syncing. The exchanged key is never visible to another device (iCloud can’t intercept it, but it could make your source device do the key exchange with the wrong target. This is similar to a TLS handshake, where iCloud is the certificate signing authority). After this point, iCloud only ever sees cypher text for wrapped keys and so can’t decrypt them even if it is completely compromised (though it could trick users into initiating the sync flow to an untrusted device). I believe the secure elements also have a certificate that’s baked in at device provisioning time so you’d need to also compromise that if you wanted to sync keys to a device that let you extract them, rather than to another Apple device.
Leaked by extraction from some WWW.
I’m talking about passkeys in general. As for Apple, they end-to-end encrypt the transfer of passkeys; so, no, they couldn’t theoretically leak.
For one, this system does not allow a central entity to monitor and disable someone’s access to all their services at once. (you didn’t specify who the benefit should apply to…)
Hm doesn’t it? Can’t Apple or 1Password delete all my synced passkeys just like they can delete all my synced passwords?
Maybe, but you can also use a local OSS password manager (e.g. keepass).
I had no idea C supported this out of the box.
I relate a lot to the state synchronization issue. I migrated a project from Go+Templ+HTMX to Go+OpenAPI+React (no SSR) because of this.
HATEOAS is a nice idea, but when your application has a lot of client-side interactivity and local state, it becomes difficult (at least for me) to keep a clean codebase that does not turn into a spaghetti monstrosity.
I mean, “state synchronization” is (IMO) the whole problem React is intended to solve. So when folks balk at the way that React works and advocate for a stripped-down lib, my question is always, “okay, so how are you planning on solving it?”
React solves state synchronization on the client side, but not between the server and the client. Actually, this second part often becomes more difficult as one adds more interactivity client-side. That’s what is leading some teams (for example linear.app) to treat the sync problem at the database level, and replicate the database between the server and client. Then the React client becomes a “simple” view on top of the database replica.
Incredible and excellent outcome! Thank you Cloudflare!
This is really neat - I hadn’t seen it before; it looks like the page was created on the 14th of September.
Something I’ve worried about in the past is that if I want to make a point-in-time backup of a SQLite database I need enough disk space to entirely duplicate the current database first (using the backup mechanism). This tool fixes that - I don’t need any extra disk space at all, since the pages that have been updated will be transmitted directly over the wire in 4096 byte chunks.
Hm, that seems inefficient. Assuming it uses 256-bit hashes and 4 KB pages, that’s roughly 1% of the total database size as overhead (32 bytes of hash per 4096-byte page, about 0.8%), even before it starts sending any changed pages.
I would have done this using something like a Prolly tree, or Aljoscha Meyer’s range-based set reconciliation, which only have to send O(log n) probes to find the differences. They can also detect a no-op (no differences) by sending just one hash.
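For concreteness, a toy sketch of the per-page-hash scheme being critiqued here (std::hash stands in for a real 256-bit hash such as SHA-256; all names are illustrative, not the tool’s actual design):

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

constexpr std::size_t kPageSize = 4096;

using Page = std::string;          // kPageSize bytes per page in a real system
using Db   = std::vector<Page>;

// One hash per page: this list is the fixed overhead that gets sent
// even when nothing has changed.
std::vector<std::uint64_t> page_hashes(const Db& db) {
    std::vector<std::uint64_t> out;
    out.reserve(db.size());
    for (const Page& p : db) out.push_back(std::hash<Page>{}(p));
    return out;
}

// The receiver compares the sender's hash list against its own copy and
// requests only the pages whose hashes differ.
std::vector<std::size_t> pages_to_fetch(const Db& local,
                                        const std::vector<std::uint64_t>& remote_hashes) {
    std::vector<std::uint64_t> local_hashes = page_hashes(local);
    std::vector<std::size_t> changed;
    for (std::size_t i = 0; i < remote_hashes.size(); ++i)
        if (i >= local_hashes.size() || local_hashes[i] != remote_hashes[i])
            changed.push_back(i);
    return changed;
}
```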
Agreed, very useful tool with a really neat implementation.
Not sure why the author says this. Ruby has had keyword arguments for over a decade, since at least version 2.0: https://www.ruby-lang.org/en/news/2013/02/24/ruby-2-0-0-p0-is-released/
Even in the years before that, it was Rails convention to pass Hashes and pretend to have keyword arguments that way. (By Ruby 2.x, keyword arguments were supported with native syntax.)
You’re correct, but as I mentioned in the post, Python has no special syntax for specifying a keyword argument. You can define keyword-only arguments, but every positional argument can also be specified as a keyword argument when you call it.
The difference is in defaults and flexibility: obviously in Ruby you can define every method you personally write as taking keyword arguments, and I’m sure that there’s a cop you can enable to enforce that, but Python does the right thing by default. If the codebase at work were written in Ruby instead of Python, I don’t think I could rely on past colleagues to have defined methods that take 7 arguments using keywords, but since it’s Python I don’t have to worry about it.
Yes, Python is super flexible in that regard, but I often find that the interaction of positional arguments, keyword arguments, default values, and varargs leads to some overly complex APIs. I’ve been guilty of abusing it myself. I can see why some languages like Zig are avoiding this.
I think it’s valid to conclude that the keyword arguments experiences of Python and Ruby are still very different and that you might long for the Python one (as someone who trod a very similar path to the article author, using Ruby intensely for over a decade before having more to do with Python again). You can’t throw a kwargs-style hash at a Ruby method if it’s expecting positional arguments, even if those positional arguments do have names. But you can of course use the x(y=z) syntax; there’s just no way to specify y programmatically. The confusion of both is unfortunate, and let’s not forget that period in between where there was all that munging of kwargs.
Tough bug. Great report. Awesome seeing Colin Percival jump in there :)
Anyone else trying github.com/Xudong-Huang/may like the author?
Though I am moved that my work has had such a profound impact on the author, may is unsound (and says as much in the README).
Sounds a lot like Erlang’s “let it fail”.
I find over and over again that any system that models the world as intercommunicating tasks and has a designer who really works out all the details for robustness… tends to look a lot like Erlang.
Well, the post even mentions Erlang, in the context of only having one supervisor instead of an Erlang-style supervisor tree.
Had the same thought while reading the post.