Fencing tokens don’t work well unless you have a single source of truth around enforcing them, which creates another dependency that this system alone doesn’t solve. And if you’re using a distributed DB that has serialization as that enforcer, well then you don’t need S3 to do the locking anymore.
Locks can often be used as an optimisation. If you’re using a primary system with optimistic concurrency control + retries, then you can use an external lock service to arbitrate access, and minimise the number of wasted retries the clients need to do.
Fencing tokens don’t work well unless you have a single source of truth around enforcing them
I don’t understand, can you clarify please? In what sense do they not work, or what’s the failure mode?
My understanding is that you do at least get the property “every node that takes the lock does so with a unique fencing token number”. Because if two clients both try to take the lock with the same fencing number then only one will succeed, since S3 is implementing atomic CAS here when it is sent an if-match conditional request.
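To make the CAS property concrete, here’s a minimal sketch of the idea, with a made-up ConditionalStore trait standing in for S3’s conditional (If-Match) PUT - illustrative only, not any real SDK:

// Hypothetical interface standing in for S3 conditional writes; not a real SDK.
trait ConditionalStore {
    // Overwrite the object only if its current ETag matches expected_etag.
    // Returns the new ETag on success, or an error if someone else got there first.
    fn put_if_match(&self, key: &str, body: &[u8], expected_etag: &str) -> Result<String, PreconditionFailed>;
    // Returns (body, etag).
    fn get(&self, key: &str) -> (Vec<u8>, String);
}

struct PreconditionFailed;

// Take the lock by bumping the fencing token stored in the object.
// If two clients race, only one CAS succeeds, so tokens are handed out uniquely.
fn acquire_lock(store: &impl ConditionalStore, key: &str) -> Result<u64, PreconditionFailed> {
    let (body, etag) = store.get(key);
    let current: u64 = String::from_utf8_lossy(&body).trim().parse().unwrap_or(0);
    let next = current + 1;
    store.put_if_match(key, next.to_string().as_bytes(), &etag)?;
    Ok(next) // unique fencing token for this lock holder
}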
Right, and a single S3 file is a single source of truth, but if you don’t write to the same S3 file, then fencing tokens are useless. The point is they have to be able to converge on something that’s able to enforce it.
In the design described in the linked article they do write to the same S3 file. They use a single S3 object for the lock and they CAS it from one state to the next when updating it.
Yes I’m aware that’s the use case they had, but what I’m saying is the way they present fencing tokens is as a generalized solution, which it’s not - it only works when you can converge on something
Happy to answer questions about Dropshot, I use it every day.
Specifically, Dropshot can produce OpenAPI documents, and then https://github.com/oxidecomputer/oxide.ts can generate a typescript client from that document. I personally am also using sqlx, so I get types the whole way from my database up through into the browser.
It’s not a perfect web server, but it serves us really well.
A common pattern I see in HTTP APIs is to treat responses like sum types where the status code determines the structure of the body and so on. For example, the Matrix protocol’s media APIs use 200 when media can be served directly and 307/308 if a redirect is necessary. Is it possible to model this in Dropshot such that it’s reflected in the generated OpenAPI document?
It seems like it probably isn’t, because each request handler function is permitted to have exactly 1 successful response code and several error codes (which must be in the 4XX and 5XX range, see ErrorStatusCode), and each error code shares the same body structure. Looking at the API design, ApiEndpointResponse has an Option<StatusCode> field, HttpCodedResponse has a StatusCode associated constant, and HttpResponseError only has a method for getting the current status code of an error response value. Looking at the generated OpenAPI document for the custom-error example seems to support this conclusion. Am I missing something?
Not 100% sure but I believe that’s true today, yes – HttpCodedResponse has a const status code.
Modeling this is an interesting challenge – I can imagine accepting an enum with variants and then producing the corresponding sum type in the OpenAPI document.
Yeah, that’s the solution I came up with for a similar project I attempted that was more warp-shaped than axum-shaped; here’s an example. The derive macro generates metadata about the response variants and uses the variant names to decide the HTTP status code to use. Maybe useful as prior art, I dunno. The obvious downside is that ? won’t work in this situation without either Try stabilizing or defining a second function where ? works and doing some mapping in the handler function from that function.
In the design I posted, there aren’t really “successful” or “unsuccessful” responses, just responses. Responses of all status codes go into the same enum, and each handler function directly returns such an enum, not wrapped in Result or anything. So if you want to use ? to e.g. handle errors, you have to define a separate function that returns e.g. Result which uses ? internally, and then call that function in the handler function and then use match or similar to convert the Result‘s inner values into your handler function’s enum.
I believe that’s a long winded way of answering “yes”, but I thought I’d elaborate further to try to make it clearer what I was trying to say originally just in case.
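Concretely, the shape I mean is roughly the sketch below (names are made up for illustration; this is not Dropshot’s API):

struct Media { url: String }

// One enum covering every response the endpoint can produce; a derive macro
// (hypothetical here) would map each variant to a status code and emit the
// corresponding sum type into the OpenAPI document.
enum GetMediaResponse {
    Ok(Media),                 // 200, body describes the media
    TemporaryRedirect(String), // 307, value is the redirect target
    NotFound,                  // 404
}

enum MediaError { Redirect(String), NotFound }

// The handler returns the enum directly, so ? isn't available here...
async fn get_media_handler(id: String) -> GetMediaResponse {
    // ...instead the fallible work lives in a helper where ? does work,
    // and the handler matches the Result back into response variants.
    match lookup_media(&id).await {
        Ok(media) => GetMediaResponse::Ok(media),
        Err(MediaError::Redirect(target)) => GetMediaResponse::TemporaryRedirect(target),
        Err(MediaError::NotFound) => GetMediaResponse::NotFound,
    }
}

async fn lookup_media(id: &str) -> Result<Media, MediaError> {
    // Placeholder logic, just to keep the sketch self-contained.
    if id.is_empty() {
        return Err(MediaError::NotFound);
    }
    Ok(Media { url: format!("https://media.example/{id}") })
}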
Ah I see. What do you think of separating out successful and unsuccessful variants into a result type? 2xx and 3xx on one side, 4xx and 5xx on the other.
That could work. I think doing it that way could be more convenient for users because ? would be usable (though probably still require some map and map_err), but at the cost of some library-side complexity. Using a single enum obviates the need for categorizing status codes. Dropshot has already solved that problem with its ErrorStatusCode (but you’ll probably also need a SuccessStatusCode, which I’m not sure currently exists).
Personally, I don’t think either way would make much difference for me as user. When working with HTTP libraries, I generally implement actual functionality in separate functions from request handlers anyway, so that such functions have no knowledge of any HTTP stuff. There are various reasons for this, the relevant one being that this minimizes the amount of code that has to actually care how the HTTP library is designed. Not everyone operates this way, though.
Something about the headline doesn’t quite click for me: is this a web framework? Namely, does it provide the web server loop as well? Or is it just related to the data model mapping between types and endpoints?
Yeah, it has the web server loop too. It’s maybe a bit too bare bones to be a “framework,” like it’s closer to a flask/sinatra than a Django/Rails. But it’s focused on “I want to produce a JSON API with OpenAPI” as a core use case.
Having done more asynchronous programming in TypeScript/JavaScript than Rust, I am used to the fact that a call might do some amount of synchronous work before returning (or perhaps lie and not return a promise at all, or return an already-resolved promise, etc.). You might need to code defensively, as the caller.
Is that meant not to happen at all in rust?
And where did the “lazy_futures” string come from? Is it the name of the crate and the default span of main in this case?
The usual line is “Rust futures do nothing until they are polled.” This is true, but misses the case described in the OP. Calling an async function only constructs the future (none of its body runs until it is polled), but a regular function that returns a Future type can do arbitrary work before it builds and returns that future. That’s entirely consistent with the rules, but is a nuance people new to asynchronous Rust can miss.
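A tiny illustration of the distinction (both compile on stable Rust):

use std::future::Future;

// Calling this does nothing until the returned future is polled.
async fn lazy() -> u32 {
    println!("runs only when polled");
    42
}

// Calling this does work immediately, before any polling happens;
// only the async block at the end is deferred.
fn eager() -> impl Future<Output = u32> {
    println!("runs at call time, before the future is ever polled");
    async { 42 }
}

fn main() {
    let _a = lazy();  // prints nothing
    let _b = eager(); // prints the "call time" line
    // Neither future has been polled; dropping them here discards the deferred work.
}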
OK, that matches my intuition, so I guess I am fortunate.
Personally, I feel rust introduces a programming style where consuming code can rely on guarantees, and also forces you to check things you don’t know (even if the run-time check simply introduces a panic).
Here, the ecosystem relying on an entirely unchecked assumption that a function returning an impl Future doesn’t do anything before polling seems to be the opposite, YOLO style of programming. “Gee I sure hope you don’t free this pointer I’m lending you.”
I’m now wondering how you could design it to enforce this assumption. The caller communicates with the executor, and has the callee be called by the executor? Hmm, if you did that it would probably be more syntactically obvious what is going on with an await prefix keyword rather than a .await method-like suffix on a computed result (a Future). (The way I understand the current design is it makes it easy to optimize away allocations for each .await since the compiler can smoosh all the state into a single enum-like thing that runs as a state machine upon polling - not sure if this is accurate, and whether similar optimizations could be made with a different design).
Most of the time you would expect folks to be writing async functions, where this problem doesn’t exist.
Boats has described this as “registers of Rust” where writing async functions vs. writing functions returning futures are different registers.
That said, sometimes you have reason to move to a lower register, which is where this confusion can arise. At each lower register you have more control but also must take greater care. The compiler ensures you don’t introduce memory unsafety, but it doesn’t protect against issues like the one in the OP.
Boats has described this as “registers of Rust” where writing async functions vs. writing functions returning futures are different registers.
These are not necessarily different registers, even by Boats’s own definitions! His final register is “async/await syntax”, and the main way I write functions that return futures is to use async blocks inside the function. Just as easy as writing an async function, but you get more control.
It’s one of those things where while you can, it doesn’t mean that you should. Any work that may need to suspend the task will need to get access to a waker, which is only accessible when a future is being polled (rather than when the function is called). So the principle of least surprise implies that you shouldn’t do any interesting work outside of the future.
Also, relying on them being inert until polled enables useful features, like the tracing functionality.
It’s useful to make a distinction between threads as a programming language abstraction vs threads as an OS feature. I am not entirely sure here, but I think my ideal world looks like this:
OS <-> Application interface is exclusively through batched, asynchronous sys calls, a-la io-uring. There simply isn’t such a thing as “blocking read”.
On top of it, you get essentially Go — the API feels like threads, but it’s a language-runtime abstraction, not OS abstraction
Though, I am not 100% sure that threads > async/await as an abstraction:
the most natural structured concurrency API I’ve seen is the taskless subset of Rust async/await: join(cat(), mouse()).await
I am intrigued by the njs comment that the main problem with async/await is promises: the syntax is retrofitted onto an existing monadic library. I am curious what the language would look like if it had only async functions, without exposing the user code to the promise type.
OS <-> Application interface is exclusively through batched, asynchronous sys calls, a-la io-uring. There simply isn’t such a thing as “blocking read”.
On top of it, you get essentially Go — the API feels like threads, but it’s a language-runtime abstraction, not OS abstraction
So kernel threads at the bottom and userland threads at the top but coupled using a not-thread API? I don’t know that that’s necessarily wrong but I wonder if it’s really better than just cutting out the middle… interface. Performance-wise I guess you get potentially multiple results per kernel<->application switch. But one could imagine an OS threading model that solves that problem, too.
the most natural structured concurrency API I’ve seen is the taskless subset of Rust async/await: join(cat(), mouse()).await
Doesn’t seem difficult to have the same over a somewhat standard channel abstraction though? e.g. cat and mouse can spawn a thread and return a oneshot channel, or the thread itself can be a oneshot channel yielding the thread’s result (as Rust’s do).
It’s not about returning results, it’s about making sure that both cat & mouse would have finished by the time we are done with the expression. The futures version is much simpler than the nurseries one: https://doc.rust-lang.org/stable/std/thread/fn.scope.html
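For comparison, the two versions side by side - a sketch, with the future-based one using the join combinator from the futures crate:

// Thread version: the scope guarantees both cat and mouse have finished
// (or panicked) before scope returns.
fn run_threads() {
    std::thread::scope(|s| {
        s.spawn(|| cat());
        s.spawn(|| mouse());
    });
}

// Future version: nothing runs until the joined future is awaited, and the
// expression doesn't complete until both halves have.
async fn run_futures() {
    futures::future::join(cat_async(), mouse_async()).await;
}

fn cat() {}
fn mouse() {}
async fn cat_async() {}
async fn mouse_async() {}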
This requires opt-in. If you don’t call join, concurrency leaks. The thing about structured concurrency is not joining when you want to, but ensuring that everything is appropriately joined even if you don’t think about it.
No, if you don’t join the futures, you still maintain the invariant that nothing continues to run past the end of the expression (by virtue of the future not even starting before you attempt to join it).
The bug pointed out by smaddox is illustrative here. It totally flew past both me and you. But any version with futures that compiles would necessarily either wait for both to finish, or cancel the other one.
I have real hot and cold feelings towards structured concurrency/automatic cancellation. It’s often really useful but many times introduces bugs of its own.
For example, with a select or try_join over mpsc::Sender::send, if the future is cancelled then the message is lost forever. This didn’t used to be clear in the documentation – I updated it to be clearer and suggest using reserve in those cases instead.
More generally, I think cancellation introduces spooky action at a distance in a way that can be quite unpleasant, especially for write or orchestration operations.
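For anyone who hits this, the reserve pattern looks roughly like the sketch below with Tokio: reserving a slot first means cancellation can only drop the permit, never the message.

use std::time::Duration;
use tokio::sync::mpsc;

async fn send_without_losing(tx: &mpsc::Sender<String>, msg: String) -> Option<String> {
    tokio::select! {
        permit = tx.reserve() => {
            match permit {
                // send() is synchronous once we hold a permit, so it can't be
                // cancelled halfway through and the message can't be lost.
                Ok(permit) => { permit.send(msg); None }
                Err(_) => Some(msg), // channel closed; hand the message back
            }
        }
        _ = tokio::time::sleep(Duration::from_secs(1)) => {
            // Cancelled before a slot was reserved: the message was never moved
            // into the channel, so hand it back instead of losing it.
            Some(msg)
        }
    }
}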
You absolutely need implicit context passed everywhere. Cancellation tokens will not be passed cleanly, and in the same way that I think structured concurrency must be impossible to misuse (eg: automatically attach to the parent), I also think nothing must be able to accidentally lose the cancellation token.
Python’s TaskGroup is incredibly easy to get frustrated by because many libraries in the python ecosystem cannot cancel properly. For instance if you use aiofiles to read from a pipe but you also have a task that fails, then the task group locks up forever (or until someone closes the pipe/writes into it). You will not see the error because it’s deferred until the cancellation succeeds.
Except if the first join call returns an error, then you never make the other join call… So it doesn’t ensure both threads have finished. And, as pointed out by matklad, calling this isn’t enforced.
For context, this is what go does*, and the ergonomics get awkward. Eg: if you want to use any kind of non-channel construct (eg: an errgroup, or context), the construct has to expose a channel, which can get pretty cumbersome.
Others have elaborated on the Rust approach, but if you’re curious about where it comes from, it’s worth looking up Concurrent ML.
* Just on the off chance anyone hasn’t experienced Go.
This is very interesting. I have solved this problem a different way.
Instead of firing NOTIFY for the data directly, I write a row into a queue table with table and row metadata. The insert into the queue fires NOTIFY, and the listeners can look up the correct record from the metadata. Every message has a monotonic sequence number, and listeners track the newest sequence they’ve seen. When they start up, they select all rows newer than the latest. Poor-man’s Kafka, basically.
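Since sqlx came up elsewhere in the thread, the listener side of this can look roughly like the sketch below with sqlx’s PgListener (table, column, and channel names are made up):

use sqlx::postgres::{PgListener, PgPool};

// Catch up from the queue table, then follow NOTIFYs.
async fn follow_events(pool: &PgPool, mut last_seq: i64) -> Result<(), sqlx::Error> {
    let mut listener = PgListener::connect_with(pool).await?;
    listener.listen("events").await?;

    loop {
        // Replay anything missed while we were down (or before LISTEN started).
        let rows: Vec<(i64, String)> =
            sqlx::query_as("SELECT seq, payload FROM events WHERE seq > $1 ORDER BY seq")
                .bind(last_seq)
                .fetch_all(pool)
                .await?;
        for (seq, payload) in rows {
            handle(seq, &payload);
            last_seq = seq; // persist this somewhere durable in a real system
        }

        // Block until the next NOTIFY, then loop back and query again.
        let _notification = listener.recv().await?;
    }
}

fn handle(seq: i64, payload: &str) {
    println!("event {seq}: {payload}");
}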
There’s more to the picture, but I’m leaving stuff out for brevity. But yes, you should persist the monotonic sequence. This value will essentially become part of a vector clock once this grows to multiple systems.
How do you deal with downtime of all listeners in combination with ill-ordered sequence numbers due to failed transactions?
Basically a listener could read number 2, then go down - then number 1 is being written (because the transaction took longer than for number 2) and the notify does not reach the listener because it’s down, then the listener starts up again, reads from 2 and gets notifications for 3 and 4 etc but misses 1.
Okay, I’ll paint a more detailed picture. The goal is to replicate data changes from one system to another in real-time. They live in entirely different datacenters.
System 1
A traditional, Postgres-backed UI application
An UPDATE happens in our Postgres database related to data we care about
This fires a trigger that writes a row into our event queue table, combined with a sequence number. The sequence number comes from a Postgres Sequence.
The INSERT fires a NOTIFY that alerts a publisher process.
The publisher uses the NOTIFY metadata to query the changed data and produces an idempotent JSON payload
The payload is published to an ordered GCP pubsub topic using the sequence number as ordering key
When the event is successfully published to the topic, the event row is deleted from the queue
System 1 (restart)
publisher queries event queue for any unprocessed events
LISTEN for new events
System 2
An in-memory policy engine with a dedicated PostgreSQL DB
A pool of journaler processes watch the pubsub topic for events
When an event happens, a journaler writes it to the journal table and acknowledges the message
The journal INSERT fires a NOTIFY
Every replica of the system has a replicator process that receives the NOTIFY
The replicator consumes the event payload and updates the in-memory data
The specific record retains the sequence number that created it, and a global sequence is updated monotonically
A heartbeat process checks that the in-memory system is live, and updates its heartbeat row in the PostgreSQL DB with its current sequence
A reaper process runs on an interval that deletes stale heartbeat records (a process restart will have a different identity)
a pg_cron job finds the oldest current sequence in the heartbeat table and flushes any journal entries below it
flushed journal entries are written to a permanent cache that enforces monotonic updates, so that only the newest version of each record is retained
System 2 (restart)
An in-memory dataset means a restart is a blank slate.
begin by loading the permanent cache into memory
query the journal for all records newer than the current global sequence
LISTEN for new journal writes
Because the sequence is atomically incremented by Postgres, this means that a failed transaction results in a discarded sequence. That’s fine.
We are using DB tables as durable buffers so that we don’t lose messages during a system restart or crash.
If there is a race condition where a record is updated again before the previous event is published, that’s also fine because the event is idempotent so whatever changes have taken place will be serialized.
We use an ordered pubsub topic on GCP to reduce the possibility of out-of-order events on the consuming side, and we enforce monotonic updates at the DB level.
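For the “enforce monotonic updates at the DB level” part, the shape is basically a conditional upsert keyed on the sequence - a sketch with made-up table and column names:

use sqlx::PgPool;

// Apply an event only if it is newer than what we already hold for that record.
async fn apply_event(pool: &PgPool, id: i64, seq: i64, payload: &str) -> Result<bool, sqlx::Error> {
    let result = sqlx::query(
        "INSERT INTO records (id, seq, payload)
         VALUES ($1, $2, $3)
         ON CONFLICT (id) DO UPDATE
         SET seq = EXCLUDED.seq, payload = EXCLUDED.payload
         WHERE records.seq < EXCLUDED.seq",
    )
    .bind(id)
    .bind(seq)
    .bind(payload)
    .execute(pool)
    .await?;
    // rows_affected() == 0 means this was an out-of-order (older) event, so it was dropped.
    Ok(result.rows_affected() > 0)
}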
First of all, thanks a lot for the detailed reply. I appreciate it!
I think you actually currently have a potential race-condition (or “transaction-condition”) bug in your application. You are saying:
Because the sequence is atomically incremented by Postgres, this means that a failed transaction results in a discarded sequence. That’s fine.
It is atomically incremented, but rows don’t become visible in sequence order. I made a mistake by mentioning “failed transactions” but what I actually meant was a long-lasting transaction. So indeed, a discarded sequence number does not matter, I agree with you on that. But what does matter is if a transaction takes a long time. Let’s say 1 hour just for fun even if it’s unrealistic.
In that case, when the first transaction starts, sequence number 1 is picked. Another transaction starts and picks 2. The second transaction finishes and commits. The trigger runs, the notify goes out and the listener picks it up and writes it somewhere. It remembers (and persists?) the number 2. NOW the listener goes down. Alternatively, it just has a temporary network issue. During that time, the first transaction finally commits. It triggers the notify, but the listener does not get it. Now the listener starts again (or the network issue is resolved). A third transaction commits with number 3. The listener receives the notify with number 3. It checks its last-seen sequence number, which is 2, so it is happy to just read in 3. Transaction 1 and its event will now be forgotten.
Possible solution that I can see: if skipped sequence numbers are detected, they need to be kept track of and need to be retried for a certain time period T. Also, on restart, the listener needs to read events from the past for a certain time period T. This T must be longer than a transaction could possibly take.
Alternatively: on a notify, don’t query by the metadata from the notify. Use timestamps and always query by timestamp > now() - T. To improve efficiency, keep track of the sequence numbers processed during the last query and add them to the query AND sequence_number not in [...processed sequence numbers].
There also seems to be a way to work with transaction IDs directly and detect such cases but this is really relying on postgres internals then.
I had a similar issue with a log (basically kafka for lazy folk who like postgres), and ended up annotating log entries with the current transaction ID, and on each notification, querying for all rows whose transaction IDs were smaller than that of the longest-running open transaction. I wrote it up too.
Yeah, you’re right, but the functions I mention return a 64-bit value (transaction Id and an epoch), and I was pretty happy that it’d take several million years for that to wrap around. At which point it would be someone else’s problem (if at all).
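For anyone who wants to try the same trick, the query shape is roughly the sketch below, using the older txid_* functions (the 64-bit epoch-extended values mentioned above); table and column names are made up:

use sqlx::PgPool;

// Fetch only events whose writing transaction is older than every transaction
// that was still open when the snapshot was taken, so a long-running
// transaction can't "appear" behind us later.
async fn safe_events(pool: &PgPool, last_txid: i64) -> Result<Vec<(i64, String)>, sqlx::Error> {
    sqlx::query_as(
        "SELECT txid, payload FROM log
         WHERE txid > $1
           AND txid < txid_snapshot_xmin(txid_current_snapshot())
         ORDER BY txid",
    )
    .bind(last_txid)
    .fetch_all(pool)
    .await
}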
Thanks for going into more detail. I don’t think the scenario you’re describing is actually a problem for us, I’ll bring you even further into the weeds and let’s see if you agree.
The event queue is populated by a trigger that runs AFTER INSERT OR UPDATE OR DELETE, which detects changes in certain data we care about and INSERTs a row into the event queue table.
The INSERT triggers the sequence increment, because this is the default value of the seq column.
So, the only sort of race condition that I could think of is multiple changes hitting the table for the same row in rapid succession, but remember that the publisher emits idempotent event payloads. So we might end up with multiple events with the same data… but we don’t care, as long as it’s fresh.
The ordered delivery from Google pubsub means that rapid updates for one row will most likely arrive in order, but even if not we enforce monotonic updates on the receiving end. So in the unlikely event that we do process events for a row out-of-order, we just drop the first event and use the newer one.
The SQL statement (including the trigger and everything it does) runs within the same transaction - and it has to, so that either the insert/update/delete happens AND the insert into the event queue table happens, or none of it happens.
So then, the problem with a “hanging” transaction is possible (depending on what your queries look like and how they come in, I suppose) and that means there will be points in time where a listener will first see an event with number X added and then, afterwards(!), will see an event with number X-1 or X-2 added. Unless I somehow misunderstand how you generate the number in the seq-column, but this is just a serial right?
The ordered delivery from Google pubsub means that rapid updates for one row will most likely arrive in order, but even if not we enforce monotonic updates on the receiving end. So in the unlikely event that we do process events for a row out-of-order, we just drop the first event and use the newer one.
So if you have only one producer and your producer basically only runs one transaction at a time before starting a new one, then the problem does not exist because there are no concurrent transactions. So if one transaction were hanging, nothing would be written anymore. If this is your setup, then you indeed don’t have the problem. It would only arise if there are e.g. two producers writing things and whatever they write increases the same serial (even if they write completely different event types and even insert into different tables). I kind of assumed you would have concurrent writes potentially.
Also see the response and the link that noncrap just made here - I think it explains it quite well. From his post:
The current implementation is perfectly fine when there is only a single producer, when you start to have multiple producers logging to the same place using batched operations (transactions), it turns out that this can fail quite badly.
The SQL statement (including the trigger and everything it does) runs within the same transaction - and it has to, so that either the insert/update/delete happens AND the insert into the event queue table happens, or none of it happens.
So then, the problem with a “hanging” transaction is possible (depending on what your queries look like and how they come in, I suppose) and that means there will be points in time where a listener will first see an event with number X added and then, afterwards(!), will see an event with number X-1 or X-2 added.
What you’re describing here appears to contradict PostgreSQL’s documentation on the READ COMMITTED isolation level:
UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time. However, such a target row might have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress). If the first updater rolls back, then its effects are negated and the second updater can proceed with updating the originally found row. If the first updater commits, the second updater will ignore the row if the first updater deleted it, otherwise it will attempt to apply its operation to the updated version of the row.
I guess I’m misunderstanding your setting. I was thinking you have multiple processes writing into one table A (insert or update or delete) potentially in parallel / concurrently, then you have a trigger that writes an event into table B. And then you have one or more listener processes that read from table B.
If you remove the “parallel / concurrently” from the whole setting, then the problem I mentioned does not exist. But that is at the cost of performance / availability. So if you lock the whole table A for writes when making an update, then the situation I’m talking about will not happen. But that is only if you lock the whole table! Otherwise, it will only be true for each row - and that is what the documentation you quoted is talking about. It’s talking about the “target row” and not “the target table”.
I was thinking you have multiple processes writing into one table A (insert or update or delete) potentially in parallel / concurrently, then you have a trigger that writes an event into table B. And then you have one or more listener processes that read from table B.
Yes, that is correct. The publisher LISTENs to Table B for new events, and the event payload indicates what data has been mutated. It queries the necessary data to generate an idempotent payload which is published to pubsub. So if there are multiple, overlapping transactions writing the same row, READ COMMITTED will prevent stale reads. There is no need to lock entire tables, standard transaction isolation is all we need here.
If we were to introduce multiple publisher processes, with multiple LISTEN subscriptions, then we would need to lock the event rows as they are being processed and query with SKIP LOCKED.
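The SKIP LOCKED variant would look something like this sketch (the event table and columns are illustrative):

use sqlx::PgPool;

// Each publisher claims a batch of unprocessed events; rows already locked by
// another publisher are skipped instead of blocked on.
async fn claim_batch(pool: &PgPool) -> Result<Vec<(i64, String)>, sqlx::Error> {
    let mut tx = pool.begin().await?;
    let batch: Vec<(i64, String)> = sqlx::query_as(
        "SELECT seq, payload FROM event_queue
         ORDER BY seq
         LIMIT 100
         FOR UPDATE SKIP LOCKED",
    )
    .fetch_all(&mut *tx)
    .await?;
    // ... publish the batch here, then delete the claimed rows ...
    sqlx::query("DELETE FROM event_queue WHERE seq = ANY($1)")
        .bind(batch.iter().map(|(s, _)| *s).collect::<Vec<i64>>())
        .execute(&mut *tx)
        .await?;
    tx.commit().await?;
    Ok(batch)
}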
I learned a few things here but hitting the PostgreSQL parameter limit has a “you’re holding it wrong” code smell.
If you’re writing data, format a CSV. You can even stream it over the wire.
If you’re doing a query you’re supposed to use joins. It’s a relational database.
If your ORM spit out a query with a million parameters your beef is with the ORM, not PostgreSQL.
If you or the ORM were doing a giant IN statement you can replace it with a join on a temporary table. You can GC them easily, they’re scoped to the current connection and you can easily create unique ones and drop them when done. It gives you a chance to load the table with the output of a select statement omitting a round trip between client and server with all your parameters. Using multiple selects to load the table gives you a lot of flexibility for whatever business logic you’re trying to implement here. Any constraints you put on the temporary table can be used by the planner. You can analyze the temporary table for more join statistics for those situations with a truly huge parameter count.
If you have a ton of parameters and they’re coming from some external system you can combine the above: format them into a csv, load them into a temporary table, join that table up in your query.
100% agreed. Also, if you really need that big of an id IN (1, 2, 3, ...) expression, you can use id = ANY(array[1, 2, 3, ...]) where the array is a single parameter in which you stuff all your ids.
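With sqlx, for example, that’s a single bound array parameter instead of thousands of placeholders - a sketch:

use sqlx::PgPool;

// One parameter total, no matter how many ids there are: Postgres sees a
// single bigint[] array rather than thousands of individual placeholders.
async fn fetch_by_ids(pool: &PgPool, ids: Vec<i64>) -> Result<Vec<(i64, String)>, sqlx::Error> {
    sqlx::query_as("SELECT id, name FROM items WHERE id = ANY($1)")
        .bind(ids)
        .fetch_all(pool)
        .await
}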
Honestly I’m not sure why it’s a smell. It feels like circular logic that it’s a smell just because postgres doesn’t support it well - but that’s a postgres problem then…
The idea that you can pass an unlimited CSV that needs serialising and parsing again, but can’t do the same with native parameters seems backwards.
Debug it on a small selection of values? It’s easier to do than debugging a query that used a temporary table which doesn’t exist anymore by the time you get the error report. At least the query included all the data.
I’m not very convinced by that article. There are reasons besides amateurism to design an IDL with a restrictive type system. When you want bindings in many different languages, you need to be thinking about their least common denominator. A feature like user defined generic types would be challenging to support across all languages in a consistent way.
Eh this is just a horrible article, and shouldn’t be re-circulated
It’s certainly valid to say that protobufs are not appropriate for a use case, or even not appropriate outside Google except in limited circumstances (e.g. if you don’t have a monorepo)
Granted, on paper it’s a cool feature. But I’ve never once seen an application that will actually preserve that property. With the one exception of routing software, nothing wants to inspect only some bits of a message and then forward it on unchanged. The vast majority of programs that operate on protobuffers will decode one, transform it into another, and send it somewhere else.
But he’s missing the whole point, and hasn’t worked on anything nontrivial that uses protobufs
I would go look at what this person has actually built, and then go look at what people have built with protobufs. (I didn’t actually go look, but I can already tell by the way this article is written)
We use JSON-RPC ( https://www.jsonrpc.org/specification ) - which is really just a loose standard around formatting an RPC call over the wire as JSON. We don’t have any regrets going with that over gRPC (which was the only real other alternative when we picked it). I like that it covers virtually all use cases you’d need, is easy to hand-write when necessary, and (with a bit of TypeScript or JSDoc) can give you most of the benefits that you get immediately without moving into the realm of diminishing returns.
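For anyone unfamiliar, a call really is just a small JSON object, which is why hand-writing one is easy - a sketch using serde_json:

use serde_json::json;

fn main() {
    // A JSON-RPC 2.0 request: method name, positional or named params, and an id
    // used to match the response back to the request.
    let request = json!({
        "jsonrpc": "2.0",
        "method": "subtract",
        "params": { "minuend": 42, "subtrahend": 23 },
        "id": 1
    });
    println!("{request}");
    // The response is equally small: {"jsonrpc": "2.0", "result": 19, "id": 1}
}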
FlatBuffers seem popular as a replacement. I’ve not used them but some teams I worked with moved to them and found that they were fast and simple to use.
I’ve meant to write a rebuttal to this for years, but I think it’s a question of framing.
If you frame it as describing a language (ie: a grammar for serialising a particular type) then it makes a little more sense. The whole issue is that he’s trying to cram an ADT style worldview (which is relatively tightly coupled, but that’s okay because it’s in-process) into a distributed world, where versioning is a problem.
TBH the biggest problem with gRPC is A) protobufs are bad and B) most people conflate gRPC and protobuf without realizing you can replace the encoding with a better one.
I honestly don’t have a problem with gRPC on its own, provided it’s used with a good encoding.
I’m on the other end of this (been using gRPC and loving it – mainly the end-to-end type safety), what made it suck in your experience, and what alternatives did you look at that might be better?
I wonder whether anyone has tried to fold system load into the concept of time. I.e. time flows slower if the system is under load from other requests.
The atomic stop bool in the code example is a bit of a distraction. In the example, the author sets the stop flag and then immediately terminates the runtime. You cannot expect the stop semantics to work if you don’t wait for the spawned tasks to complete. There is no point to that. So, if we think the stop flag away, what remains is the insight that spawned tasks can be terminated in any order, which does not feel as much like a footgun to me.
The point of the stop flag is to actually show what kind of (real, non-abstract) thing could break as a result of the unspecified ordering, which is indeed the only bold sentence on the page:
The Tokio runtime will drop tasks in an arbitrary order during shutdown.
If you wrote the same code without the stop flag, you’d probably handle this correctly by intuition, writing each task in awareness that it is just sitting there in a big sea of runtime, hoping someone else can hear them — if not, when you hit the interrupt, you’ll likely find out.
But if you design it with the stop flag first, you might be thinking and planning more for the ‘expected’ shutdown sequence, and therefore neglect the case where the runtime is dropped instead, causing a bit of an unscheduled disassembly.
So I think that insight is the only one the page is trying to convey, but the stop flag isn’t a distraction, it’s motivating that insight.
The reason the stop flag does not work is because the runtime is shut down before it can even do its work. That’s the reason it’s broken; it doesn’t really have anything to do with the order in which tasks are dropped. If the author’s suggested solution (to drop everything after stopping all the tasks) were applied here then it would happen to solve this specific manifestation of the underlying bug, but the bug remains. If the receiver task for example would have some other cleanup code inside the if statement then that may never execute even though apparently that is what the author intended.
thinking and planning more for the ‘expected’ shutdown sequence
True–but there’s no actual explicit shutdown sequence present. In order for the tasks to shut down cleanly, the author should be waiting for tasks to clean themselves up once signalled (and you’d need a mechanism that would actually notify the task, too).
It seems that they’re assuming that the runtime will run the tasks to completion; which could block indefinitely, and is just as non-obvious.
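A sketch of that “signal, then actually wait” shape, which sidesteps the arbitrary drop order entirely (assuming Tokio’s watch channel for the signal):

use std::time::Duration;
use tokio::sync::watch;

fn main() {
    let rt = tokio::runtime::Runtime::new().unwrap();
    let (stop_tx, mut stop_rx) = watch::channel(false);

    let worker = rt.spawn(async move {
        // Run until the stop signal flips to true.
        while !*stop_rx.borrow() {
            tokio::select! {
                _ = stop_rx.changed() => {}
                _ = tokio::time::sleep(Duration::from_millis(100)) => {
                    // ... periodic work ...
                }
            }
        }
        // Cleanup here is guaranteed to run, because main waits for the task below.
    });

    // Signal shutdown, then wait for the task before the runtime is dropped,
    // instead of relying on drop order.
    let _ = stop_tx.send(true);
    rt.block_on(worker).unwrap();
}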
It seems that they’re assuming that the runtime will run the tasks to completion; which could block indefinitely, and is just as non-obvious.
I wonder if that is their experience? I’ve had so many quirks with tokio getting to the end of my async main and then just… stalling. I don’t know why, I assume some task is being blocked, possibly by an I/O operation. It’s always whenever I’m doing something with networking and don’t take the absolutely utmost care to isolate and control the connection to make the async runtime happy. But in some of my prototypes I had to literally put a std::process::exit() at the end of my async main because it would get to that point and then just keep running. I never figured out exactly what causes it, it just happens.
I feel like a lot of received wisdom in software is essentially folk knowledge, where folk don’t necessarily go out seeking expertise on how to use tools. And in a way this is a cunning way to handle that problem.
The paper Mock Roles, not Objects makes the point that by getting folks to focus on the relationships and roles that different objects take, you can end up with more minimal and stable interfaces, and thus more stable tests. Whereas the common usage is to use mock objects to stub out an arbitrary object, which may be more chatty than it needs to be, and so your tests end up being more brittle.
Fakes are a great way to handle that, as you can embed the model of how the real component works, so that not everyone has to remember the intricacies and subtleties of a chatty protocol.
I feel like a lot of received wisdom in software is essentially folk knowledge, where folk don’t necessarily go out seeking expertise on how to use tools.
For software in general this does ring true to me. But for testing in particular, my view is almost the opposite: there’s actually very little good information on how to do it, and an abundance of outright bad advice.
For example, it is absolutely critical to think in terms of size/scope/extent dimensions of test. But among all the testing advice I’ve read over the years, this is only articulated with sufficient clarity in one book, SWE@G. Almost everyone thinks in terms of unit and integration tests, which hopelessly muddle important and inconsequential dimensions of tests together.
Or, to use the example on hand, I’d classify the “Mock Role not Objects” paper as snake oil, on two grounds.
First, although mocks perhaps can be useful in select circumstances, by and large, “don’t use mocks” is the correct advice to give directionally. A mock-promoting paper which doesn’t say “actually, you should almost never use this” is bad advice. In particular, no, their worked example doesn’t require mocks and is harmed by their introduction. To test timed caches you need to a) introduce fake time and b) hook up logging/metrics with your test to assert hit rate.
Second, the paper treats Clock and Reload policy on equal ground, failing to make the absolutely critical distinction that the time dependency is a different kind of thing. Time dependency is what increases that all-important size metric of the test. Failing to note and to communicate that time is really special is, again, bad advice.
For software in general this does ring true to me. But for testing in particular, my view is almost the opposite: there’s actually very little good information on how to do it, and an abundance of outright bad advice.
I whole-heartedly agree on this one. Most folks I’ve met seem to take a naive-empiricist approach, which, without critical examination of the results, does lead to quite poor results.
size/scope/extent dimensions of test … unit and integration tests, which hopelessly muddle important and inconsequential dimensions of tests together
Do you know of any articles on this? I think I have a feel for what you mean, but I’d love to see a fully fleshed out take on this. It’d help me to understand your later argument, which depends on the notion of a distinct size metric.
Or, to use the example on hand, I’d classify the “Mock Role not Objects” paper as snake oil, on two grounds.
I think the thing to bear in mind here is that it’s a paper presented at a research conference, and so a specific description of a particular technique, rather than being a general guide to how to write tests.
The main point of the paper is that mocks can work really well when you’re using them to help define how the clients want to use a dependency, before you commit to implementing that dependency.
My point is, that using mock objects well requires a change in approach and thinking, rather than just adopting a tool. If you don’t approach them with a curiosity about their affordances, and consider how they might best be used, then yeah, they’re going to be a pain.
For example, if you have a large, chatty interface, and you end up concluding that the mocks are painful in that case, then great! If you can then ask, why is this painful, and how do I make it less painful, then wonderful.
It does occasionally seem like we have a generation of carpenters trying hammers, hitting their thumbs a few times, and deciding to just use glue and screws for evermore. But hell, if that works for you, great!
Second, the paper treats Clock and Reload policy on equal ground, failing to make the absolutely critical distinction that the time dependency is a different kind of thing.
I mean, I can see it in terms of how we generally consider spatial dimensions as qualitatively different from the time dimension, and yeah, there’s definitely ways you could express the example in §3.5 more clearly.
It’s also worth bearing in mind that the paper is 20 years old, and we’ve learnt a bunch in that time. Eg: they use expect() when querying the policy or time source, when you should be able to tell it works just from the result, in this case.
Poor or outdated examples don’t necessarily invalidate the notion of focussing on roles/relationships and need-driven development, though.
This mirrors the common practice in statically typed functional languages e.g. Haskell. You write types upfront to stub out the domain and guide the implementation, then keep iterating types/functions until you get what you want. In a way the strong IO/pure separation and static type system removes the need for mock tests entirely.
This is definitely a weird take. Having tried to learn haskell (but not quite had the patience) I’m sure the author is entirely correct about the benefits. However, it doesn’t really consider that these tools are situated in a socio-technical system, and you have to consider the prejudices and habits of, well, your typical developer. And for most people, they’d likely consider that learning purescript/halogen seems like too much hard work for little reward.
I mean, I can understand the post if the developer was being forced to write React/typescript under duress, and was just having a shitty day. I know all too well what it’s like to work with what feel like under-powered tools.
Is it technically better at everything? Maybe. Is it better in the context of existing software developers? Maybe not.
Well, when we’re using average developers as arguments, let’s not confuse ‘what technology the average developer already knows’ with ‘what technology leads to the best outcomes when average developers must learn and use it’.
Ya, very true, what’s ‘best’ definitely varies across groups and people. I was trying to make a slightly different point, though: I think that ‘What technology do people already know?’ is an unreliable rule of thumb that overvalues the time cost of learning a new technology.
Many things are built on unsuitable technologies or ad-hoc invent-as-you-go frameworks, merely because “spend time up front learning language/library X that solves this” was taken off the table as an option. And that’s why I object to “what people already know” as a major decision factor.
To digress a bit: Often, it’s not the time cost of learning that’s unaffordable, but the social cost. Many technologies would repay the time cost of learning, even if the entire team or department or every new dev has to learn — but the social cost to spending time learning means it is not attempted. Cat Hicks has done good research (real research! she’s a social scientist!) on this; the whitepaper Coding in the Dark.
Final digression, a quote from Coding in the Dark:
Reputational costs will supersede “developmental” ideals; when code reviews focus on performance over learning, code writers feel pressured to hide their actual learning.
“Crunchy peanut butter is better than creamy” you claim, but you fail to consider the context of people with peanut allergies. What a silly opinion on peanut butter you have!
Over the past couple of years I’ve started believing that “test code” is fundamentally different from “prod code” in ways that mean good prod idioms don’t apply. Another one: Make Illegal States Unrepresentable is a really bad idea in formal specifications.
Although I would guess it’s a similar idea to why a web form can carry data that wouldn’t be considered valid by the backend (or even by the HTML5 validation attributes). It’s a lot easier to report on what’s actually wrong if you can represent those invalid states. So my feeling is that in a spec, it makes it easier to make assertions about what makes a state valid, but also make it easier to see how it’s wrong.
It’s basically what @crstry said: the point of writing a spec is to make sure your system doesn’t enter a bad state, but in order to know that you have to be able to represent the bad state you’re trying to prevent.
Like, imagine you’re trying to prevent cycles in a dependency tree. If your specification makes cycles unrepresentable, you can’t model the case where someone hand-codes a cycle in package.json.
Wouldn’t the job of the formal specification be complete by having the illegal states be unrepresentable? You can assert there are no cycles either through the formal specification or through the data structure that the formal specification is proving. Either way, the job is done.
Yeah, making states in the “machine” unrepresentable is a good idea, but you still want to test that the unrepresentable states cover all the invalid states. So you still want to make them representable in the overall spec.
This is a place where refinement is real useful. AbstractSpec has representable but invalid states, ImplSpec tries to make them unrepresentable, proving that ImplSpec => AbstractSpec means that you succeeded.
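In executable-model terms, the point is that the checker has to be able to construct the bad state before it can assert that it never occurs - a sketch in Rust rather than a spec language, with made-up names:

use std::collections::{HashMap, HashSet};

// Dependencies are modelled loosely enough that a cycle can be represented,
// exactly like a hand-edited package.json can contain one.
type DepGraph = HashMap<String, Vec<String>>;

fn has_cycle(graph: &DepGraph) -> bool {
    fn visit(node: &str, graph: &DepGraph, visiting: &mut HashSet<String>, done: &mut HashSet<String>) -> bool {
        if done.contains(node) { return false; }
        if !visiting.insert(node.to_string()) { return true; } // back edge => cycle
        let cyclic = graph
            .get(node)
            .map(|deps| deps.iter().any(|d| visit(d, graph, visiting, done)))
            .unwrap_or(false);
        visiting.remove(node);
        done.insert(node.to_string());
        cyclic
    }
    let mut visiting = HashSet::new();
    let mut done = HashSet::new();
    graph.keys().any(|n| visit(n, graph, &mut visiting, &mut done))
}

fn main() {
    let mut graph = DepGraph::new();
    graph.insert("a".into(), vec!["b".into()]);
    graph.insert("b".into(), vec!["a".into()]); // the illegal state, representable
    // The "spec" is the assertion over representable states, not the type system.
    assert!(has_cycle(&graph));
}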
I was hoping the discussion here would be about the title, and not the technological details of why Tailwind is better than standards-based CSS.
But now that we’re here, the developers who promote Tailwind need to explain to me (in simple terms) why I should write my next HTML button with this syntax taken directly from Catalyst — the latest and greatest of Tailwind:
Marketing is off-topic here. I’d have removed this link if I’d seen it earlier, but there’s some substantial comments and it seems unlikely the discussion is going to go off-topic or into a flamewar.
It’s described longer here. Speaking of marketing, you need to stop exploiting Lobsters as a marketing channel. Stop submitting your own work until it’s less than a quarter of your site activity.
It might have been in scope if your other seven stories weren’t exclusively advertising your products and all 67 of your comments weren’t all in your own marketing posts. The threshold for “in scope” is lower if it’s clear you’re posting in good faith, and so far, you are not.
You are being encouraged to participate in topics posted by others to counterbalance your participation in topics that are posted by you about your own projects and products.
I get that, but that’s just not how I work. I tend to isolate myself inside my coding chamber and then come back with a thing that I want to get feedback on. During my time of development, I usually stay away from social media platforms like X, Hacker News, or Lobsters.
the developers who promote Tailwind need to explain to me (in simple terms) why I should write my next HTML button with this syntax taken directly from Catalyst
I’ve never developed with Tailwind, but it is clear to me that the Catalyst project does not suggest that you write your next HTML button with that syntax. The Catalyst docs for Button show the actual source code a developer would write to get that button:
import { Button } from '@/components/button'
function Example() {
return <Button>Button</Button>
}
The HTML you copied is just the generated output of a build step. Developers aren’t meant to work directly with that generated output, so how messy it looks doesn’t matter. (Unless you’re arguing it causes performance problems, in which case I’d need to see a benchmark against equivalent CSS to be convinced it’s significant.)
And answering your follow-up question from the article:
Do I need another wall [of text] for the white button?
It’s the same component as the above, but with an outline prop: <Button outline>Button</Button>.
As @adamshaylor noted, Tailwind’s docs recommend that you extract combinations of reused CSS and reused HTML as components. In other words, Tailwind assumes that you already have a component or template partial system in your front-end. I think that’s your fundamental misunderstanding about Tailwind. Catalyst is just an example set of such components.
The assumption that a “component” abstraction is already available helps explain why Tailwind recommends using its own method of abstraction, @apply, sparingly.
For more analysis of the connection between Tailwind and front-end components, including exploration of component-related solutions that aren’t Tailwind, I recommend MrJohz’s excellent comment on Hacker News:
I think understanding Tailwind on its own terms, with its own merits, is really valuable for understanding the state of CSS development right now, what works well, and what needs to improve. […]
Tailwind isn’t just a different way of writing CSS declarations, it’s a way of tying those declarations to components. This is nothing new - it’s the same concept as BEM, or CSS modules, or CSS-in-JS, or scoped CSS [links added], amongst others. Essentially, when writing large projects with extensive CSS, most people run into maintainability issues, and most people seem to develop similar solutions: they tie styles to an already existing, more maintainable unit of code, namely the component. […]
Understanding Tailwind on its own terms, then, we start to get answers to a number of the questions that the author asks. […]
[…] Tailwind exists alongside so many other, similar solutions. And I can fully understand why someone may not like the Tailwind solution specifically - its DSL for defining decorations is idiosyncratic, particularly if you’re already familiar with CSS. But this article doesn’t contrast Tailwind with those other tools, instead it compares having a solution for these issues with just not having a solution at all, and then tries to tell you that ignoring the problem is the better plan.
It feels to me that this piece comes more from a place of vitriol, with some post-hoc reasoning thrown in. Eg:
This was a hefty statement as it contradicts with all the prior work and studies about CSS
This might be true, but I can’t see that there’s a whole lot in the piece (eg: references) that helps us see where the author is coming from.
Ultimately, there’s nothing making you use tailwind (beyond, perhaps, a job–but such is life). Sure, there’s a whole bunch of asinine tech nonsense out in the world–but I don’t think this will persuade anyone who doesn’t already think along these lines.
With all due respect, this blog is much more subjective and emotion-based than you might think. You have barely contradicted Adam’s points; you only contradict the marketing. Also, the title is just clickbait and outright attacking - is this really how you want to present your ideas?
I think Adam’s first blog post is still a very good read, let me quote this part:
When you think about the relationship between HTML and CSS in terms of “separation of concerns”, it’s very black and white. You either have separation of concerns (good!), or you don’t (bad!). This is not the right way to think about HTML and CSS. Instead, think about dependency direction. There are two ways you can write HTML and CSS:
“Separation of Concerns” CSS that depends on HTML. Naming your classes based on your content (like .author-bio) treats your HTML as a dependency of your CSS. The HTML is independent; it doesn’t care how you make it look, it just exposes hooks like .author-bio that the HTML controls. Your CSS on the other hand is not independent; it needs to know what classes your HTML has decided to expose, and it needs to target those classes to style the HTML. In this model, your HTML is restyleable, but your CSS is not reusable.
“Mixing Concerns” HTML that depends on CSS. Naming your classes in a content-agnostic way after the repeating patterns in your UI (like .media-card) treats your CSS as a dependency of your HTML. The CSS is independent; it doesn’t care what content it’s being applied to, it just exposes a set of building blocks that you can apply to your markup. Your HTML is not independent; it’s making use of classes that have been provided by the CSS, and it needs to know what classes exist so that it can combine them however it needs to to achieve the desired design. In this model, your CSS is reusable, but your HTML is not restyleable.
CSS Zen Garden takes the first approach, while UI frameworks like Bootstrap or Bulma take the second approach. Neither is inherently “wrong”; it’s just a decision made based on what’s more important to you in a specific context. For the project you’re working on, what would be more valuable: restyleable HTML, or reusable CSS?
This is an important, objective idea, and is always missing in these discussions. I personally think that the CSS Zen Garden approach is cool, but is simply not how any project I have ever worked on works. You don’t change company look by a CSS stylesheet, besides some basic stuff.
The other direction is very commonly applied, just often not knowingly. I wager that you yourself even work in something like this - this is not tailwind specific. Simply explicitly thinking this way about the problem helps tremendously - I don’t think there is any silver bullet in how you achieve this; you either have to have good namespacing (local CSS with custom HTML components works just as well), or use something like tailwind (you can, again, just use HTML components to reuse code - ideally you only have the component in the code base, not the raw markup, at least for something more complex).
I wager that the reason many people like tailwind is that it removes the non-local property of CSS, which, let’s be honest, is just a huge source of issues. “Cascading” is often more harmful than useful. Tailwind (and many other tools that simply reset styling) can make maintenance of large code bases much easier (your single-line CSS change for one part of the site won’t break some button n levels deep in some completely unrelated part).
Maybe, but their messaging mostly talks about the practicality of this vs. other approaches, and indeed, your article makes other claims about the practicalities; so it’s pretty disingenuous to claim that it’s purely about messaging without touching on the substance.
Eg: with this article, you talk about the amount of generated CSS–bear in mind that it’s explicitly designed to be used with dead code elimination. Eg: on one of my projects, it reduces down to about 5kb, most of which is CSS normalisation.
And your argument about the markup may be reasonable, but it’s kind of meaningless without talking about overall impact. Eg: does it have a negative performance impact? To what extent? Without that, it just comes across as “Hey, look how ugly this is!”.
Never mind the claim that they discourage using @apply to build more semantic classes, when it’s an example on the front page.
Also, you seem to predicate most of the argument on the idea that the reader who uses CSS just isn’t actually very good at using CSS. And even taken at face value, this can be read as tailwind meeting its customers where they are. Unlike the Neu approach, which comes off as kind of exclusionary. I mean, I’m all for encouraging folks to get better with their tools, but this isn’t the way to do it.
Something very similar was the original idea that led to Scheme. In the terminology of the time: everything should be an actor. As the initial article states:
This work developed out of an initial attempt to understand the actorness of actors. Steele thought he understood it, but couldn’t explain it; Sussman suggested the experimental approach of actually building an “ACTORS interpreter”. This interpreter attempted to intermix the use of actors and LISP lambda expressions in a clean manner. When it was completed, we discovered that the “actors” and the lambda expressions were identical in implementation
While working on this, it led to the usage of continuations as a fundamental primitive, and using continuation-passing-style in the compiler to hide this from the end user. This would allow for “actors” to suspend themselves and pass control on to another.
Now, actors and coroutines aren’t exactly the same, but they are similar enough and one can be used to implement the other. The article call-with-current-continuation patterns by Ferguson and Deugo has a section specifically about how to implement coroutines using continuations.
Suspension of an actor (or coroutine) can also forcibly happen after some time has passed, which is a natural way to implement green threads (we do this in CHICKEN).
I’ve seen a similar idea show up in responses to this post in a few other places it’s been shared: that this coroutines idea is just rediscovering the actor model. And it is, in a way. What interests me is not so much the idea of coroutines-as-basic-primitive (which is, after all, less powerful than call/cc, and morally equivalent to delimited continuations). What interests me more is the sequence type system, its similarity to state machines, and the structure and guarantees it adds to coroutines. call/cc is often considered too powerful for its own good, much like goto, and a type system like this could make coroutine-based programming more practical.
I really like Concurrent ML (ie: mostly first class, composable “operations”) as a model for concurrency, especially cases that don’t easily fit the one process one mailbox model of actors. As a taster, I really enjoyed this article building an implementation from scratch: https://lobste.rs/s/jfxp39/new_concurrent_ml_2017
That’s a fun issue. FreeBSD does document that nbytes can’t be larger than INT_MAX, and doing otherwise will EINVAL. Oddly enough, write(2) apparently relaxed this 10 years ago, but has a debug.iosize_max_clamp sysctl to restrict it back; either the read(2) manpage is wrong or read skipped that relaxation. It might be wrong: looking at the Rust stdlib, it hit that issue back in 2016 (#38590) and added a special case for macOS, and this special case is still in there and still only for macOS.
macOS does not document this limit for read(2), but it might be inferrable from readv(2) stating:
[EINVAL] the sum of the iov_len values in the iov array overflowed a 32b integer
OpenBSD explicitly documents a limit of SSIZE_MAX; NetBSD is more implicit but apparently makes similar guarantees:
[EINVAL] […] the total length of the I/O is more than can be expressed by the ssize_t return value
I suspect that most things that want to read that much actually want to mmap the file. Reading 1 GiB from the network at a time may make sense for very high-throughput activities, but by the time you’re doing 1 GiB the overhead of the system call is amortised to nothing, and doing multiple smaller reads in userspace is unlikely to make a measurable difference (really curious if anyone has counterexamples!).
Memory mapping files is scary. When using normal I/O operations, I/O errors are communicated as return values from function calls, which is easy to deal with. When memory mapping the file, I/O errors are communicated by the kernel sending your process a SIGBUS signal, and that might happen at literally any memory read or write operation in the mapped region.
In a lot of contexts, reading the whole file into memory at the start of the program will be the right approach. Computers have loads of RAM now; 1 GiB is hardly an unreasonably large file to hold in memory any more.
Of course, reading the file into memory in chunks smaller than 1 GiB is perfectly fine and won’t cause performance issues. People probably aren’t reading files in one go for performance, but because this foot-gun is really non-obvious: “get the size of the file, then malloc that amount of memory and read() that amount of data” is how the API looks like it wants to be used.
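For what it’s worth, a minimal sketch of the chunked approach in Rust (the 64 MiB cap is an arbitrary choice; Rust’s standard library already loops over short reads internally, this just makes the bounded-read pattern explicit):

```rust
use std::fs::File;
use std::io::{self, Read};

// Read a whole file through bounded read() calls instead of one giant read,
// sidestepping per-platform limits such as INT_MAX on the byte count.
fn read_whole_file(path: &str) -> io::Result<Vec<u8>> {
    const CHUNK: usize = 64 * 1024 * 1024;
    let mut file = File::open(path)?;
    let mut buf = Vec::new();
    let mut chunk = vec![0u8; CHUNK];
    loop {
        let n = file.read(&mut chunk)?;
        if n == 0 {
            return Ok(buf); // EOF
        }
        buf.extend_from_slice(&chunk[..n]);
    }
}
```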
At my previous job, we had a program that would mmap() (read-only) its data rather than using read(). In the 13 years I was at the company, not once was there ever an issue with IO using this scheme, and the programs would run 24/7 serving requests. Also, it allowed better memory utilization: multiple instances of the program would be running, and they all could share the same pages of memory.
This might be one of those things where, if you start to notice issues from an mmap(2)ed memory region, you’ve probably got bigger issues at hand, since it’s exactly the same mechanism used for demand-paging executables as for reading your data file. Naturally, that’s more complicated if you actually have different devices for data vs. control plane work.
Although unless you have a sophisticated storage subsystem, I suspect (I’ve not done sysadmin, much less DBA work for a long time) that the best thing to do is replace the disks and rebuild the node/volumes anyway.
The problem though is, other programs can modify the file while it’s mapped. A user’s editor could truncate it and write new content to it, for example.
After playing a bit in a VM, unless I made a mistake somewhere it looks a lot like FreeBSD’s read(2) was updated, but the manpage was missed. I can’t exclude having screwed up the test program, but it seems to happily read or write 4GB buffers from and to files (and does indeed report having read / written 4GB).
Did you try a 32-bit program? I think, on 64-bit kernels, 32-bit programs are able to map almost the entire 4 GiB address space and so should be able to allocate a 3 GiB buffer and read into it. That seems like the kind of thing that might not work (having objects that are more than half the size of the address space has odd interactions with C because you can’t represent offsets within them in a ptrdiff_t). It probably works, given that the 32-bit compat layer uses the same kern_read function and should just zero extend the arguments.
Why would I? It’s not relevant and I don’t care about it.
The behaviours freebsd documents are that IO operations are limited to sizes of either INT_MAX or SSIZE_MAX. On 32b platforms they’re the same thing, that is useless to the question I’m asking, which is whether read(2) is restricted to INT_MAX still, or it’s limited to SSIZE_MAX and the manpage is outdated.
You can’t read 2GB or more with read(2) on a 32 bit system because the return value is signed and negative values are errors. I suppose, strictly speaking, only -1 is an error, but I know I am careless about that detail in my code!
The mismatch between C’s size_t and ptrdiff_t, and POSIX’s ssize_t, is one of those things that’s basically broken and would be causing failures all the time, except we moved to 64 bits before it really started hurting.
Is there no way for userspace multitasking to be preemptive? That is, at least on Linux, are threads the lightest weight primitive of preemptive scheduling?
Preemptive multitasking ultimately requires a way to interrupt the CPU which is executing some arbitrary code, without requiring cooperation. In the OS kernel usually some kind of timer interrupt fires routinely to mark the end of a scheduling slot, preserves the task that was executing at the time, and puts another in its place. This happens periodically, switching tasks back and forth so they all get some time.
In a UNIX process you don’t have interrupts, but you do actually have signals. Signals are not actually that different from hardware interrupts: you can mask their delivery in critical sections, they can be observed to occur in between any two instructions (they’re completely asynchronous), they nest, and they preserve the register state of the interrupted task. Many signals sent on UNIX systems are process directed and sent from another process to the victim process. They don’t have to be though! UNIX systems generally offer a thread directed signalling system call which can send signals to a specific thread within the same process. There are also signalling mechanisms that can send a signal repeatedly based on a timer.
With either timers set to deliver signals, or a control thread that sleeps and wakes and uses thread directed signals to interrupt other threads, you can achieve a form of preemptive multitasking entirely in user mode. You can interrupt another thread that’s hogging the CPU, preserve its register state (the signal handler is given a ucontext_t which contains this and other critical state), and switch another previously preserved task onto the thread just like an operating system would, using routines like getcontext() and setcontext().
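As a rough illustration of the control-thread variant (a toy using the libc crate, not a scheduler): a control thread periodically fires a thread-directed signal at a CPU-hogging worker, playing the “timer tick” role described above. A real implementation would save and restore the interrupted context in the handler; this one only counts how often it got interrupted.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::{hint, thread, time::Duration};

static PREEMPTIONS: AtomicUsize = AtomicUsize::new(0);

extern "C" fn on_preempt(_sig: libc::c_int) {
    // Runs between two arbitrary instructions of the interrupted thread, much
    // like a hardware interrupt; only async-signal-safe work belongs here.
    PREEMPTIONS.fetch_add(1, Ordering::Relaxed);
}

fn main() {
    let handler = on_preempt as extern "C" fn(libc::c_int);
    unsafe { libc::signal(libc::SIGURG, handler as libc::sighandler_t) };

    // A worker that never yields voluntarily.
    let worker = thread::spawn(|| {
        let mut x: u64 = 0;
        for i in 0..2_000_000_000u64 {
            x = x.wrapping_add(hint::black_box(i));
        }
        x
    });

    // The "scheduler": wake up on a timer and interrupt the worker thread.
    use std::os::unix::thread::JoinHandleExt;
    let tid = worker.as_pthread_t() as libc::pthread_t;
    for _ in 0..10 {
        thread::sleep(Duration::from_millis(10));
        unsafe { libc::pthread_kill(tid, libc::SIGURG) };
    }

    let _ = worker.join();
    println!("worker was interrupted {} times", PREEMPTIONS.load(Ordering::Relaxed));
}
```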
In general I’m not sure it’s valuable to do this. It’s complex, and would be hard to get right (signals are, like interrupts, notoriously difficult to reason about). The kernel generally has more and better information, and cheaper tools to act on that information with respect to context switching.
If anyone is looking for an example, it looks like the Go runtime (as of 1.14.0) has a control thread which sends a signal to other threads for preemption.
Thank you for your educational response! 🙏🏾 TIL that signals can nest.
In general I’m not sure it’s valuable to do this. It’s complex, and would be hard to get right (signals are, like interrupts, notoriously difficult to reason about). The kernel generally has more and better information, and cheaper tools to act on that information with respect to context switching.
Threads are the golden path for preemptive multitasking; simple and robust. But do threads have higher overhead than the above scheme? That is, we have green threads / M:N schedulers / etc. because OS threads can’t be packed as densely as these userspace context switching mechanisms. But, to the best of my knowledge, they’re all cooperative multitasking mechanisms.
(Ab)using signals seems like a sick hack; but, I suppose my real question is do kernels offer similar golden path for preemptive multitasking?
But do threads have higher overhead than the above scheme?
In the pre-emptive user-space threads case, the overhead of kernel-space pre-emptive threads is a large in-kernel struct and other kernel-level data structures associated with the thread.
In user-space concurrency, you can eliminate two other big overheads by avoiding pre-emption and “stackful” threads. Cooperative threading requires saving much less execution context than pre-emptive threading. Stackless concurrency also frequently requires much less context, i.e. you don’t need to save the entire stack, just the explicit context associated with each coroutine (but you have to be careful with heap fragmentation here).
None of these matter when you only have a relatively small number of threads, but when you get into the 10^5 or 10^6 region, it starts to matter.
On the flip side of that, a context switch in the kernel scheduler is (usually) a lot cheaper than signal delivery. Signals require tens of thousands of cycles of work on typical *NIX systems. The main benefit of userspace scheduling is that you can switch between userspace threads cooperatively much more cheaply.
There’s an annoying property for scheduling that cooperative scheduling has much better performance when everyone is playing nicely but catastrophic performance when they are not. In contrast, preemptive scheduling has far better worst-case performance than cooperative, but is less good when everyone is playing by the rules. The ideal model is probably a cooperative threading engine with preemptive fallback. The thing I really want for this in a desktop OS is a way of controlling the next timer interrupt from userspace and getting the signal only if it expires (and, similarly, another hardware timer for the kernel to control to trigger switches to other processes). This would let a userspace scheduler keep pushing the signal horizon forwards as long as it’s context switching fast enough, but as soon as one thread hogs the CPU, it gets preempted.
In a user space threading system I wrote, I did this for exactly these efficiency reasons. I realized late in the process that pre-emption had significant negative efficiency implications.
This is actually an area where OpenBSD and other libc-dependent systems like macOS have an advantage over Linux. Since system calls must pass through a C function call interface, you don’t have to save as much context during the system call. It could be nearly as fast as a cooperative user space swap (modulo cost of entering and leaving kernel mode, which is cheap, or at least was pre-PTI). I’m fairly certain they currently just save the entire user space execution context as during pre-emption, for simplicity’s sake.
Even not doing that, the scheduler should already be able to reset the timer on context switch but I think for, again, simplicity’s sake it always just runs at 100hz. Linux does something smarter here now in this vein called “tickless” mode.
I don’t think length is necessarily the problem. Personally I find curl(1) (and bash(1), a similarly long man page) to be very legible; but it helps to have internalised some rules for jumping to particular sub-sections.
I very rarely use curl, so I have never internalised its commands, yet every time I use it I am able to quickly find the answer for what I want to do in the man page with a quick search for a relevant term.
It sounds like you’re looking for a more introductory or tutorial style of documentation? If that’s the case, that’s totally fair. Man pages are great as a reference, but awful as a way of learning something from the ground up.
Fencing tokens don’t work well unless you have a single source of truth around enforcing them, which creates another dependency that this system alone doesn’t solve. And if you’re using a distributed DB that has serialization as that enforcer, well then you don’t need S3 to do the locking anymore.
Locks can often be used as an optimisation. If you’re using a primary system with optimistic concurrency control + retries, then you can use an external lock service to arbitrate access, and minimise the number of wasted retries the clients need to do.
I don’t understand, can you clarify please? In what sense do they not work, or what’s the failure mode?
My understanding is that you do at least get the property “every node that takes the lock does so with a unique fencing token number”. Because if two clients both try to take the lock with the same fencing number then only one will succeed, since S3 is implementing atomic CAS here when it is sent an if-match conditional request.
Right, and a single S3 file is a single source of truth, but if you don’t write to the same S3 file, then fencing tokens are useless. The point is they have to be able to converge on something that’s able to enforce it.
In the design described in the linked article they do write to the same S3 file. They use a single S3 object for the lock and they CAS it from one state to the next when updating it.
Yes I’m aware that’s the use case they had, but what I’m saying is the way they present fencing tokens is as a generalized solution, which it’s not - it only works when you can converge on something
Oh, so it’s not the MUI some people are already familiar with.
Or even an older MUI :)
Happy to answer questions about Dropshot, I use it every day.
Specifically, Dropshot can produce OpenAPI documents, and then https://github.com/oxidecomputer/oxide.ts can generate a typescript client from that diagram. I personally am also using sqlx, so I get types the whole way from my database up through into the browser.
It’s not a perfect web server, but it serves us really well.
A bit of a random one, but is the name a reference to dropwizard by any chance?
Fishpong is https://github.com/fishpong/docs/wiki/Primer, a ping-pong variant that a lot of folks at Oxide love. “Drop shot” is a concept that appears across various racket sports: https://en.wikipedia.org/wiki/Drop_shot
It also has the advantage of being short and wasn’t taken at the time.
I’m not sure! I’ll ask.
A common pattern I see in HTTP APIs is to treat responses like sum types where the status code determines the structure of the body and so on. For example, the Matrix protocol’s media APIs use 200 when media can be served directly and 307/308 if a redirect is necessary. Is it possible to model this in Dropshot such that it’s reflected in the generated OpenAPI document?
It seems like it probably isn’t, because each request handler function is permitted to have exactly 1 successful response code and several error codes (which must be in the 4XX and 5XX range, see ErrorStatusCode), and each error code shares the same body structure. Looking at the API design, ApiEndpointResponse has an Option<StatusCode> field, HttpCodedResponse has a StatusCode associated constant, and HttpResponseError only has a method for getting the current status code of an error response value. Looking at the generated OpenAPI document for the custom-error example seems to support this conclusion. Am I missing something?
Not 100% sure but I believe that’s true today, yes – HttpCodedResponse has a const status code.
Modeling this is an interesting challenge – I can imagine accepting an enum with variants and then producing the corresponding sum type in the OpenAPI document.
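For illustration only, the kind of enum being imagined might look something like this (these types are invented, not an existing Dropshot API):

```rust
// Hypothetical response sum type: one variant per status code family, each
// with its own body shape.
struct MediaBody { bytes: Vec<u8> }
struct RedirectBody { location: String }

enum MediaResponse {
    Ok(MediaBody),          // 200: serve the media directly
    Redirect(RedirectBody), // 307/308: send the client elsewhere
}
```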
Yeah, that’s the solution I came up with for a similar project I attempted that was more warp-shaped than axum-shaped; here’s an example. The derive macro generates metadata about the response variants and uses the variant names to decide the HTTP status code to use. Maybe useful as prior art, I dunno. The obvious downside is that ? won’t work in this situation without either Try stabilizing or defining a second function where ? works and doing some mapping in the handler function from that function.
re ? not working, you can always map a successful response into the corresponding enum variant explicitly, right?
In the design I posted, there aren’t really “successful” or “unsuccessful” responses, just responses. Responses of all status codes go into the same enum, and each handler function directly returns such an enum, not wrapped in Result or anything. So if you want to use ? to e.g. handle errors, you have to define a separate function that returns e.g. Result which uses ? internally, and then call that function in the handler function and then use match or similar to convert the Result’s inner values into your handler function’s enum. I believe that’s a long-winded way of answering “yes”, but I thought I’d elaborate further to try to make it clearer what I was trying to say originally, just in case.
Ah I see. What do you think of separating out successful and unsuccessful variants into a result type? 2xx and 3xx on one side, 4xx and 5xx on the other.
That could work. I think doing it that way could be more convenient for users because ? would be usable (though it would probably still require some map and map_err), but at the cost of some library-side complexity. Using a single enum obviates the need for categorizing status codes. Dropshot has already solved that problem with its ErrorStatusCode (but you’ll probably also need a SuccessStatusCode, which I’m not sure currently exists).
Personally, I don’t think either way would make much difference for me as a user. When working with HTTP libraries, I generally implement actual functionality in separate functions from request handlers anyway, so that such functions have no knowledge of any HTTP stuff. There are various reasons for this, the relevant one being that it minimizes the amount of code that has to actually care how the HTTP library is designed. Not everyone operates this way, though.
Something about the headline doesn’t quite click for me: is this a web framework? Namely, does it provide the web server loop as well? Or is it just related to the data model mapping between types and endpoints?
Yeah, it has the web server loop too. It’s maybe a bit too bare bones to be a “framework,” like it’s closer to a flask/sinatra than a Django/Rails. But it’s focused on “I want to produce a JSON API with OpenAPI” as a core use case.
Interesting!
Having done more asynchronous programming in TypeScript/JavaScript than Rust, I am used to the fact that the call might do some amount of synchronous work before returning (or perhaps lie and not return a promise at all, or return a resolved promise, etc). You might need to code defensively, as the caller.
Is that meant not to happen at all in rust?
And where did the “lazy_futures” string come from? Is that the name of the crate, and the default span of main in this case?
The usual line is “Rust futures do nothing until they are polled.” This is true, but misses the case described in the OP. Calling an async function merely constructs a future without running any of its body, but a regular function returning a Future type can do work before building the future. That’s entirely consistent with the rules, but it’s a nuance people new to asynchronous Rust can miss.
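A tiny illustration of that nuance (the function names are made up):

```rust
use std::future::Future;

async fn lazy() {
    println!("runs only once the future is polled");
}

fn eager() -> impl Future<Output = ()> {
    println!("runs as soon as eager() is called, before any poll");
    async {
        println!("runs only once the returned future is polled");
    }
}

fn main() {
    let _a = lazy();  // prints nothing
    let _b = eager(); // the first println! in eager() has already run
    // Neither future has been polled, so neither async body has run yet.
}
```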
OK, that matches my intuition, so I guess I am fortunate.
Personally, I feel rust introduces a programming style where consuming code can rely on guarantees, and also forces you to check things you don’t know (even if the run-time check simply introduces a panic).
Here, the ecosystem relying on an entirely unchecked assumption that a function returning an impl Future doesn’t do anything before polling seems to be the opposite, YOLO style of programming. “Gee, I sure hope you don’t free this pointer I’m lending you.”
I’m now wondering how you could design it to enforce this assumption. The caller communicates with the executor, and has the callee be called by the executor? Hmm, if you did that it would probably be more syntactically obvious what is going on with an await prefix keyword rather than a .await method-like suffix on a computed result (a Future). (The way I understand the current design is that it makes it easy to optimize away allocations for each .await, since the compiler can smoosh all the state into a single enum-like thing that then runs as a state machine upon polling - not sure if this is accurate, and whether similar optimizations could be made with a different design.)
Most of the time you would expect folks to be writing async functions, where this problem doesn’t exist.
Boats has described this as “registers of Rust” where writing async functions vs. writing functions returning futures are different registers.
That said, sometimes you have reason to move to a lower register, which is where this confusion can arise. At each lower register you have more control but also must take greater care. The compiler ensures you don’t introduce memory unsafety, but it doesn’t protect against issues like the one in the OP.
These are not necessarily different registers, by Boats’s definitions! His final register is “async/await syntax”, and the main way I write functions that return futures is to use async blocks inside the function. Just as easy as writing an async function, but you get more control.
It’s one of those things where just because you can doesn’t mean that you should. Any work that may need to suspend the task will need access to a waker, which is only available while a future is being polled (rather than when the function is called). So the principle of least surprise implies that you shouldn’t do any interesting work outside of the future.
Also, relying on them being inert until polled enables useful features, like the tracing functionality.
It’s useful to make a distinction between threads as a programming language abstraction vs threads as an OS feature. I am not entirely sure here, but I think my ideal world looks like this:
Though, I am not 100% sure that threads > async/await as an abstraction: join(cat(), mouse()).await
This is basically how GHC Haskell is
So kernel threads at the bottom and userland threads at the top but coupled using a not-thread API? I don’t know that that’s necessarily wrong but I wonder if it’s really better than just cutting out the middle… interface. Performance-wise I guess you get potentially multiple results per kernel<->application switch. But one could imagine an OS threading model that solves that problem, too.
Doesn’t seem difficult to have the same over a somewhat standard channel abstraction though? e.g. cat and mouse can spawn a thread and return a oneshot channel, or the thread itself can be a oneshot channel yielding the thread’s result (as Rust’s do).
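A quick sketch of that shape with std channels (cat is a made-up task name):

```rust
use std::sync::mpsc;
use std::thread;

// Spawn the work and hand back a receiver that acts as a oneshot result channel.
fn spawn_cat() -> mpsc::Receiver<String> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // ... do the actual work here ...
        let _ = tx.send("cat finished".to_string());
    });
    rx
}

fn main() {
    let cat_done = spawn_cat();
    println!("{}", cat_done.recv().unwrap()); // blocks until the thread reports in
}
```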
It’s not about returning results, it’s about making sure that both cat & mouse would have finished by the time we are done with the expression. The futures version is much simpler than the nurseries one: https://doc.rust-lang.org/stable/std/thread/fn.scope.html
It also does essentially none of the things scoped threads are for. If you just want to know the two threads have ended, here’s how it looks:
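A minimal sketch of that, assuming cat and mouse are the hypothetical tasks from earlier in the thread:

```rust
use std::thread;

fn cat() { /* hypothetical work */ }
fn mouse() { /* hypothetical work */ }

fn main() {
    let a = thread::spawn(cat);
    let b = thread::spawn(mouse);
    // Explicitly wait for both before moving past this point.
    a.join().unwrap();
    b.join().unwrap();
}
```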
there you go, you know that both a and b have finished by the time you’re done with the expression.
This requires opt-in. If you don’t call join, concurrency leaks. The thing about structured concurrency is not joining when you want to, but ensuring that everything is appropriately joined even if you don’t think about it.
So is joining futures?
That has to do with the concurrency APIs more than the concurrency styles.
No, if you don’t join the futures, you still maintain the invariant that nothing continues to run past the end of the expression (by virtue of the future not even starting before you attempt to join it).
The bug pointed out by smaddox is illustrative here. It totally flew past both me and you. But any version with futures that compiles would necessarily either wait for both to finish, or cancel the other one.
I have real hot and cold feelings towards structured concurrency/automatic cancellation. It’s often really useful but many times introduces bugs of its own.
For example, with a select or try_join over mpsc::Sender::send, if the future is cancelled then the message is lost forever. This used to not be clear in the documentation – I updated it to be clearer and to suggest using reserve in those cases instead.
More generally, I think cancellation introduces spooky action at a distance in a way that can be quite unpleasant, especially for write or orchestration operations.
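For concreteness, roughly what the reserve pattern looks like with Tokio’s bounded mpsc channel (a hedged sketch; the function and the shutdown signal are made up):

```rust
use tokio::sync::mpsc::Sender;
use tokio::sync::oneshot;

async fn forward(tx: Sender<String>, shutdown: oneshot::Receiver<()>) {
    let msg = String::from("important");
    tokio::select! {
        // Await only the capacity reservation; `msg` stays in our hands until
        // we are sure it can be delivered.
        reserved = tx.reserve() => {
            if let Ok(permit) = reserved {
                permit.send(msg); // infallible once capacity is reserved
            }
        }
        // If this branch wins, we still own `msg` and can retry or log it,
        // instead of it being swallowed by a cancelled `tx.send(msg)` future.
        _ = shutdown => {
            eprintln!("shutting down; message not sent: {msg}");
        }
    }
}
```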
I think it’s not cancellation in general, but the Rust-specific way of doing it. Super low confidence here, but I think the sweet spot is roughly this: https://github.com/ziglang/zig/issues/5913#issuecomment-2075008648
You absolutely need implicit context passed everywhere. Cancellation tokens will not be passed cleanly otherwise, and in the same way that I think structured concurrency must be impossible to misuse (eg: automatically attach to the parent), I also think nothing must be able to accidentally lose the cancellation token.
Python’s TaskGroup is incredibly easy to get frustrated by because many libraries in the Python ecosystem cannot cancel properly. For instance, if you use aiofiles to read from a pipe but you also have a task that fails, then the task group locks up forever (or until someone closes the pipe/writes into it). You will not see the error because it’s deferred until the cancellation succeeds.
Except if the first join call returns an error, then you never make the other join call… So it doesn’t ensure both threads have finished. And, as pointed out by matklad, calling this isn’t enforced.
For context, this is what Go does*, and the ergonomics get awkward. Eg: if you want to use any kind of non-channel construct (eg: an errgroup, or context), the construct has to expose a channel, which can get pretty cumbersome.
Others have elaborated on the Rust approach, but if you’re curious about where it comes from, it’s worth looking up Concurrent ML.
* Just on the off chance anyone hasn’t experienced Go.
This is very interesting. I have solved this problem a different way.
Instead of firing NOTIFY for the data directly, I write a row into a queue table with table and row metadata. The insert into the queue fires NOTIFY, and the listeners can look up the correct record from the metadata. Every message has a monotonic sequence number, and listeners track the newest sequence they’ve seen. When they start up, they select all rows newer than the latest. Poor man’s Kafka, basically.
This is a cool pattern!
Presumably listeners store their most-recently-seen sequence in Postgres too?
There’s more to the picture, but I’m leaving stuff out for brevity. But yes, you should persist the monotonic sequence. This value will essentially become part of a vector clock once this grows to multiple systems.
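For a rough idea of what the listener side of this pattern can look like, here’s a hedged sketch using sqlx (mentioned upthread). The table name, channel name, payload handling, and sequence persistence are all assumptions, not the actual schema being described:

```rust
use sqlx::postgres::{PgListener, PgPoolOptions};

async fn run_listener(db_url: &str, mut last_seq: i64) -> Result<(), sqlx::Error> {
    let pool = PgPoolOptions::new().connect(db_url).await?;
    let mut listener = PgListener::connect(db_url).await?;
    listener.listen("events_channel").await?;

    // Catch up: process every queue row newer than the last sequence we saw.
    let rows: Vec<(i64, String)> =
        sqlx::query_as("SELECT seq, payload FROM events WHERE seq > $1 ORDER BY seq")
            .bind(last_seq)
            .fetch_all(&pool)
            .await?;
    for (seq, payload) in rows {
        handle(&payload);
        last_seq = seq; // a real listener would persist this durably
    }

    // Then follow new events as NOTIFY messages arrive.
    loop {
        let note = listener.recv().await?;
        // The notification payload carries the queue row's metadata (e.g. its
        // seq), which the listener would use to fetch and record the real row.
        handle(note.payload());
    }
}

fn handle(payload: &str) {
    println!("got event: {payload}");
}
```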
We do this exact thing all the time, works great.
How do you deal with downtime of all listeners in combination with ill-ordered sequence numbers due to failed transactions?
Basically a listener could read number 2, then go down - then number 1 is being written (because the transaction took longer than for number 2) and the notify does not reach the listener because it’s down, then the listener starts up again, reads from 2 and gets notifications for 3 and 4 etc but misses 1.
Okay, I’ll paint a more detailed picture. The goal is to replicate data changes from one system to another in real-time. They live in entirely different datacenters.
System 1: A traditional, Postgres-backed UI application
System 1 (restart)
System 2: An in-memory policy engine with a dedicated PostgreSQL DB
System 2 (restart): An in-memory dataset means a restart is a blank slate.
Because the sequence is atomically incremented by Postgres, this means that a failed transaction results in a discarded sequence. That’s fine.
We are using DB tables as durable buffers so that we don’t lose messages during a system restart or crash.
If there is a race condition where a record is updated again before the previous event is published, that’s also fine because the event is idempotent so whatever changes have taken place will be serialized.
We use an ordered pubsub topic on GCP to reduce the possibility of out-of-order events on the consuming side, and we enforce monotonic updates at the DB level.
First of all, thanks a lot for the detailed reply. I appreciate it!
I think you actually currently have a potential race-condition (or “transaction-condition”) bug in your application. You are saying:
It is atomically incremented, but it is not incremented in order. I made a mistake by mentioning “failed transactions” but what I actually meant was a long-lasting transaction. So indeed, a discarded sequence number does not matter, I agree with you on that. But what does matter is if a transaction takes a long time. Let’s say 1 hour just for fun even if it’s unrealistic.
In that case, when the first transaction starts, its sequence number is picked (number 1). Another transaction starts and picks 2. The second transaction finishes and commits. The trigger runs, the notify goes out, and the listener picks it up and writes it somewhere. It remembers (and persists?) the number 2. NOW the listener goes down. Alternatively, it just has a temporary network issue. During that time, the first transaction finally commits. It triggers the notify, but the listener does not get it. Now the listener starts again (or the network issue is resolved). A third transaction commits with number 3. The listener receives the notify with number 3. It checks its last seen sequence number, which is 2, so it is happy to just read in 3. Transaction 1 and its event will now be forgotten.
Possible solution that I can see: if skipped sequence numbers are detected, they need to be kept track of and need to be retried for a certain time period T. Also, on restart, the listener needs to read events from the past for a certain time period T. This T must be longer than a transaction could possibly take.
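A sketch of what that gap tracking might look like on the listener side (type and field names are made up; retry_window plays the role of T and must exceed the longest transaction you expect):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct GapTracker {
    last_seq: i64,
    pending: HashMap<i64, Instant>, // skipped seq -> when we first noticed the gap
    retry_window: Duration,         // "T": must exceed the longest possible transaction
}

impl GapTracker {
    // Called for every sequence number the listener sees (via NOTIFY or re-query).
    fn observe(&mut self, seq: i64) {
        for missing in (self.last_seq + 1)..seq {
            self.pending.entry(missing).or_insert_with(Instant::now);
        }
        self.pending.remove(&seq); // a late arrival fills its gap
        self.last_seq = self.last_seq.max(seq);
    }

    // Sequence numbers worth re-querying right now; entries older than the
    // window are given up on (their transactions presumably never committed).
    fn to_retry(&mut self) -> Vec<i64> {
        let window = self.retry_window;
        self.pending.retain(|_, first_seen| first_seen.elapsed() < window);
        self.pending.keys().copied().collect()
    }
}
```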
Alternatively: on a notify, don’t query by the metadata from the notify. Use timestamps and always query by timestamp > now() - T. To improve efficiency, keep track of the sequence numbers processed during the last query and add them to the query: AND sequence_number NOT IN [...processed sequence numbers].
There also seems to be a way to work with transaction IDs directly and detect such cases, but that is really relying on Postgres internals then.
I had a similar issue with a log (basically Kafka for lazy folk who like Postgres), and ended up annotating log entries with the current transaction ID, and on each notification, querying for all rows whose transaction IDs were smaller than that of the longest-running open transaction. I wrote it up too.
Oh, that’s very interesting. Don’t you have to handle txid wraparound in that case?
Yeah, you’re right, but the functions I mention return a 64-bit value (transaction Id and an epoch), and I was pretty happy that it’d take several million years for that to wrap around. At which point it would be someone else’s problem (if at all).
Thanks for going into more detail. I don’t think the scenario you’re describing is actually a problem for us, I’ll bring you even further into the weeds and let’s see if you agree.
The event queue is populated by a trigger that runs AFTER INSERT OR UPDATE OR DELETE, which detects changes in certain data we care about and INSERTs a row into the event queue table.
The INSERT triggers the sequence increment, because this is the default value of the seq column.
So, the only sort of race condition that I could think of is multiple changes hitting the table for the same table row in rapid succession, but remember that the publisher emits idempotent event payloads. So we might end up with multiple events with the same data… but we don’t care, as long as it’s fresh.
The ordered delivery from Google pubsub means that rapid updates for one row will most likely arrive in order, but even if not, we enforce monotonic updates on the receiving end. So in the unlikely event that we do process events for a row out of order, we just drop the first event and use the newer one.
The SQL statement (including the trigger and everything it does) runs within the same transaction - and it has to, so that either the insert/update/delete happens AND the insert into the event queue table happens, or none of it happens.
So then, the problem with a “hanging” transaction is possible (depending on what your queries look like and how they come in, I suppose), and that means there will be points in time where a listener will first see an event with number X added and then, afterwards(!), will see an event with number X-1 or X-2 added. Unless I somehow misunderstand how you generate the number in the seq column, but this is just a serial, right?
So if you have only one producer and your producer basically only runs one transaction at a time before starting a new one, then the problem does not exist because there are no concurrent transactions. So if one transaction were hanging, nothing would be written anymore. If this is your setup, then you indeed don’t have the problem. It would only arise if there are e.g. two producers writing things and whatever they write increases the same serial (even if they write completely different event types and even insert into different tables). I kind of assumed you would have concurrent writes potentially.
Also see the response and the link that noncrap just made here - I think it explains it quite well. From his post:
What you’re describing here appears to contradict PostgreSQL’s documentation on the READ COMMITTED isolation level:
It is certainly possible that the documentation does not accurately reflect reality, so let’s look at Kyle Kingsbury’s analysis of PG:
So as far as I can tell, my design will work exactly as intended.
I guess I’m misunderstanding your setting. I was thinking you have multiple processes writing into one table A (insert or update or delete) potentially in parallel / concurrently, then you have a trigger that writes an event into table B. And then you have one or more listener processes that read from table B.
If you remove the “parallel / concurrently” from the whole setting, then the problem I mentioned does not exist. But that is at the cost of performance / availability. So if you lock the whole table A for writes when making an update, then the situation I’m talking about will not happen. But that is only if you lock the whole table! Otherwise, it will only be true for each row - and that is what the documentation you quoted is talking about. It’s talking about the “target row” and not “the target table”.
Yes, that is correct. The publisher LISTENs to Table B for new events, and the event payload indicates what data has been mutated. It queries the necessary data to generate an idempotent payload which is published to pubsub. So if there are multiple, overlapping transactions writing the same row, READ COMMITTED will prevent stale reads. There is no need to lock entire tables, standard transaction isolation is all we need here.
If we were to introduce multiple publisher processes, with multiple LISTEN subscriptions, then we would need to lock the event rows as they are being processed and query with SKIP LOCKED.
I learned a few things here but hitting the PostgreSQL parameter limit has a “you’re holding it wrong” code smell.
100% agreed. Also, if you really need that big of an
id IN (1, 2, 3, ...) expression, you can use id = ANY(array[1, 2, 3, ...]) where the array is a single parameter in which you stuff all your ids.
The actual description on the page (not in the linked issue) was:
It’s easy for me to imagine hitting this when initially importing a photo library.
Honestly, I’m not sure why it’s a smell. It feels like circular logic to say it’s a smell just because Postgres doesn’t support it well - but that’s a Postgres problem then…
The idea that you can pass an unlimited CSV that needs serialising and parsing again, but can’t do the same with native parameters seems backwards.
You can, if you use an array of struct types as a parameter.
And it’s probably a smell, because how could you easily debug such a query?
Debug it on a small selection of values? It’s easier to do than debugging a query that used a temporary table which doesn’t exist anymore by the time you get the error report. At least the query included all the data.
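For what it’s worth, this is roughly what the single-array-parameter approach looks like from application code with sqlx (a sketch; table and column names are invented):

```rust
use sqlx::PgPool;

// The whole id list travels to Postgres as one bound parameter, so the
// statement never approaches the per-statement parameter limit.
async fn fetch_by_ids(pool: &PgPool, ids: Vec<i64>) -> Result<Vec<(i64, String)>, sqlx::Error> {
    sqlx::query_as("SELECT id, name FROM assets WHERE id = ANY($1)")
        .bind(ids)
        .fetch_all(pool)
        .await
}
```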
None of this addresses any of the fundamental design flaws in protobufs: https://reasonablypolymorphic.com/blog/protos-are-wrong/
We’ve suffered thru 5 years of protobuf at my work and not only would I ditch it if I could, I’d strongly advise anyone considering it to stay away.
I’m not very convinced by that article. There are reasons besides amateurism to design an IDL with a restrictive type system. When you want bindings in many different languages, you need to be thinking about their least common denominator. A feature like user defined generic types would be challenging to support across all languages in a consistent way.
Eh this is just a horrible article, and shouldn’t be re-circulated
It’s certainly valid to say that protobufs are not appropriate for a use case, or even not appropriate outside Google except in limited circumstances (e.g. if you don’t have a monorepo)
But he’s missing the whole point, and hasn’t worked on anything nontrivial that uses protobufs
I would go look at what this person has actually built, and then go look at what people have built with protobufs. (I didn’t actually go look, but I can already tell by the way this article is written)
He doesn’t know what he doesn’t know
IIRC OP literally worked at Google, so I feel like dismissing the criticism as one that comes from lack of exposure probably isn’t reasonable.
What would you advise using instead, if anything?
We use JSON-RPC ( https://www.jsonrpc.org/specification ) - which is really just a loose standard around formatting an RPC call over the wire as JSON. We don’t have any regrets going with that over gRPC (which was the only real other alternative when we picked it). I like that it covers virtually all use cases you’d need, is easy to hand-write when necessary, and (with a bit of TypeScript or JSDoc) can give you most of the benefits that you get immediately without moving into the realm of diminishing returns.
Cap’n Proto is supposed to be a huge improvement over protobufs, though I haven’t used it myself:
https://capnproto.org/
FlatBuffers seem popular as a replacement. I’ve not used them, but some teams I worked with moved to them and found that they were fast and simple to use.
If connectrpc supported flatbuffers, I would definitely use that instead of protos.
I’ve meant to write a rebuttal to this for years, but I think it’s a question of framing. If you frame it as describing a language (ie: a grammar for serialising a particular type) then it makes a little more sense. The whole issue is that he’s trying to cram an ADT-style worldview (which is relatively tightly coupled, but that’s okay because it’s in-process) into a distributed world, where versioning is a problem.
I definitely prefer it over openapi’s type syntax/system.
I appreciate this comment, when I started reading the article, I thought “I wonder what technomancy would say about gRPC”.
TBH the biggest problem with gRPC is A) protobufs are bad and B) most people conflate gRPC and protobuf without realizing you can replace the encoding with a better one.
I honestly don’t have a problem with gRPC on its own, provided it’s used with a good encoding.
The main mistake of gRPC is its use of HTTP trailers which blunders into a huge pile of nasty interop problems.
I’m on the other end of this (been using gRPC and loving it – mainly the end-to-end type safety), what made it suck in your experience, and what alternatives did you look at that might be better?
I wonder whether anyone has tried to fold system load into the concept of time. I.e. time flows slower if the system is under load from other requests.
It sounds like you’re looking for queue management algorithms, such as CoDel, PIE or BBR.
The atomic stop bool in the code example is a bit of a distraction. In the example, the author sets the stop flag and then immediately terminates the runtime. You cannot expect the stop semantics to work if you don’t wait for the spawned tasks to complete. There is no point to that. So, if we think the stop flag away, what remains is the insight that spawned tasks can be terminated in any order which does not feel as much as a footgun to me.
The point of the stop flag is to actually show what kind of (real, non-abstract) thing could break as a result of the unspecified ordering, which is indeed the only bold sentence on the page:
If you wrote the same code without the stop flag, you’d probably handle this correctly by intuition, writing each task in awareness that it is just sitting there in a big sea of runtime, hoping someone else can hear them — if not, when you hit the interrupt, you’ll likely find out.
But if you design it with the stop flag first, you might be thinking and planning more for the ‘expected’ shutdown sequence, and therefore neglect the case where the runtime is dropped instead, causing a bit of an unscheduled disassembly.
So I think that insight is the only one the page is trying to convey, but the stop flag isn’t a distraction, it’s motivating that insight.
The reason the stop flag does not work is that the runtime is shut down before it can even do its work. That’s the reason it’s broken; it doesn’t really have anything to do with the order in which tasks are dropped. If the author’s suggested solution (to drop everything after stopping all the tasks) were applied here then it would happen to solve this specific manifestation of the underlying bug, but the bug remains. If the receiver task, for example, had some other cleanup code inside the if statement, then that may never execute even though apparently that is what the author intended.
True–but there’s no actual explicit shutdown sequence present. In order for the tasks to shut down cleanly, the author should be waiting for tasks to clean themselves up once signalled (and you’d need a mechanism that would actually notify the task, too).
It seems that they’re assuming that the runtime will run the tasks to completion, which could block indefinitely and is just as non-obvious.
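To make that ordering concrete, here is a hedged sketch (not the article’s code): set the stop flag and then wait for the task inside the runtime, instead of letting the runtime be dropped while the task might still be mid-cleanup.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::time::Duration;

fn main() {
    let stop = Arc::new(AtomicBool::new(false));
    let rt = tokio::runtime::Runtime::new().unwrap();

    let worker = {
        let stop = stop.clone();
        rt.spawn(async move {
            while !stop.load(Ordering::Relaxed) {
                tokio::time::sleep(Duration::from_millis(10)).await;
            }
            // Cleanup here actually gets a chance to run, because the task is
            // awaited below rather than being dropped along with the runtime.
        })
    };

    stop.store(true, Ordering::Relaxed);
    rt.block_on(worker).unwrap(); // wait for the task before the runtime is dropped
}
```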
I wonder if that is their experience? I’ve had so many quirks with tokio getting to the end of my async main and then just… stalling. I don’t know why; I assume some task is being blocked, possibly by an I/O operation. It’s always whenever I’m doing something with networking and don’t take the absolute utmost care to isolate and control the connection to make the async runtime happy. But in some of my prototypes I had to literally put a std::process::exit() at the end of my async main because it would get to that point and then just keep running. I never figured out exactly what causes it, it just happens.
I feel like a lot of received wisdom in software is essentially folk knowledge, where folk don’t necessarily go out seeking expertise on how to use tools. And in a way this is a cunning way to handle that problem.
The paper Mock Roles, not Objects makes the point that by getting folks to focus on the relationships and roles that different objects take, you can end up with more minimal, more stable interfaces, and thus more stable tests. Whereas the common usage is to use mock objects to stub out an arbitrary object, which may be more chatty than it needs to be, and so your tests end up being more brittle.
Fakes are a great way to handle that, as you can embed the model of how the real component works, so that not everyone has to remember the intricacies and subtleties of a chatty protocol.
For software in general this does ring true to me. But for testing in particular, my view is almost the opposite: there’s actually very little good information on how to do it, and an abundance of outright bad advice.
For example, it is absolutely critical to think in terms of the size/scope/extent dimensions of tests. But among all the testing advice I’ve read over the years, this is only articulated with sufficient clarity in one book, SWE@G. Almost everyone thinks in terms of unit and integration tests, which hopelessly muddle important and inconsequential dimensions of tests together.
Or, to use the example on hand, I’d classify the “Mock Role not Objects” paper as snake oil, on two grounds.
First, although mocks perhaps can be useful in select circumstances, by and large, “don’t use mocks” is the correct advice to give directionally. A mock-promoting paper which doesn’t say “actually, you should almost never use this” is bad advice. In particular, no, their worked example doesn’t require mocks and is harmed by their introduction. To test timed caches you need to a) introduce fake time and b) hook up logging/metrics with your test to assert hit rate.
Second, the paper treats Clock and Reload policy on equal ground, failing to make the absolutely critical distinction that the time dependency is a different kind of thing. Time dependency is what increases that all-important size metric of the test. Failing to note and to communicate that time is really special is, again, bad advice.
I whole-heartedly agree on this one. Most folks I’ve met seem to take a naive-empiricist approach, which, without critical examination of the results, does lead to quite poor results.
Do you know of any articles on this? I think I have a feel for what you mean, but I’d love to see a fully fleshed-out take on this. It’d help me to understand your later argument, which depends on the notion of a distinct size metric.
I think the thing to bear in mind here is that it’s a paper presented at a research conference, and so a specific description of a particular technique, rather than being a general guide to how to write tests.
The main point of the paper is that mocks can work really well when you’re using them to help define how the clients want to use a dependency, before you commit to implementing that dependency.
My point is, that using mock objects well requires a change in approach and thinking, rather than just adopting a tool. If you don’t approach them with a curiosity about their affordances, and consider how they might best be used, then yeah, they’re going to be a pain.
For example, if you have a large, chatty interface, and you end up concluding that the mocks are painful in that case, then great! If you can then ask why is this painful, and how do I make it less painful, then wonderful.
It does occasionally seem like we have a generation of carpenters trying hammers, hitting their thumbs a few times, and deciding to just use glue and screws for evermore. But hell, if that works for you, great!
I mean, I can see it in terms of how we generally consider spatial dimensions as qualitatively different from the time dimension, and yeah, there’s definitely ways you could express the example in §3.5 more clearly.
It’s also worth bearing in mind that the paper is 20 years old, and we’ve learnt a bunch in that time. Eg: they use expect() when querying the policy or time source, when you should be able to tell it works just from the result, in this case.
Poor or outdated examples don’t necessarily invalidate the notion of focussing on roles/relationships, and need-driven development, though.
Still curious as to what you mean by this though.
On size/extent/scope
The article linked from OP: https://abseil.io/resources/swe-book/html/ch11.html#test_size
My more compressed summary: https://matklad.github.io/2022/07/04/unit-and-integration-tests.html
This mirrors the common practice in statically typed functional languages e.g. Haskell. You write types upfront to stub out the domain and guide the implementation, then keep iterating types/functions until you get what you want. In a way the strong IO/pure separation and static type system removes the need for mock tests entirely.
This is definitely a weird take. Having tried to learn Haskell (but not quite had the patience), I’m sure the author is entirely correct about the benefits. However, it doesn’t really consider that these tools are situated in a socio-technical system, and you have to consider the prejudices and habits of, well, your typical developer. And for most people, they’d likely consider that learning PureScript/Halogen seems like too much hard work for little reward.
I mean, I can understand the post if the developer was being forced to write React/typescript under duress, and was just having a shitty day. I know all too well what it’s like to work with what feel like under-powered tools.
Is it technically better at everything? Maybe. Is it better in the context of existing software developers? Maybe not.
ah yes, the mythical “average developer”.
Well, speaking as a distinctly average developer myself, what of it?
Well, when we’re using average developers as arguments, let’s not confuse ‘what technology the average developer already knows’ with ‘what technology leads to the best outcomes when average developers must learn and use it’.
Yes! That’s exactly my point, that what’s “best” is contextual and will vary according to circumstances.
Ya, very true, what’s ‘best’ definitely varies across groups and people. I was trying to make a slightly different point, though: I think that ‘What technology do people already know?’ is an unreliable rule of thumb that overvalues the time cost of learning a new technology.
Many things are built on unsuitable technologies or ad-hoc invent-as-you-go frameworks, merely because “spend time up front learning language/library X that solves this” was taken off the table as an option. And that’s why I object to “what do people already know” as a major decision factor.
To digress a bit: Often, it’s not the time cost of learning that’s unaffordable, but the social cost. Many technologies would repay the time cost of learning, even if the entire team or department or every new dev has to learn — but the social cost to spending time learning means it is not attempted. Cat Hicks has done good research (real research! she’s a social scientist!) on this; the whitepaper Coding in the Dark.
Final digression, a quote from Coding in the Dark:
“Crunchy peanut butter is better than creamy” you claim, but you fail to consider the context of people with peanut allergies. What a silly opinion on peanut butter you have!
Over the past couple of years I’ve started believing that “test code” is fundamentally different from “prod code”, in ways that make good prod idioms not apply. Another one: Make Illegal States Unrepresentable is a really bad idea in formal specifications.
I would love to learn more about your last sentence!
I think @hwayne has talked about the idea of maybe some invalid states being useful here: https://lobste.rs/s/e2sm00/constructive_vs_predicative_data.
Although I would guess it’s a similar idea to why a web form can carry data that wouldn’t be considered valid by the backend (or even by the HTML5 validation attributes). It’s a lot easier to report on what’s actually wrong if you can represent those invalid states. So my feeling is that in a spec, it makes it easier to make assertions about what makes a state valid, but also make it easier to see how it’s wrong.
This also reminds me of Postel’s Law (be conservative in what you do, be liberal in what you accept from others).
It’s basically what @crstry said: the point of writing a spec is to make sure your system doesn’t enter a bad state, but in order to know that, you have to be able to represent the bad state you’re trying to prevent.
Like, imagine you’re trying to prevent cycles in a dependency tree. If your specification makes cycles unrepresentable, you can’t model the case where someone hand-codes a cycle in
package.json.
Wouldn’t the job of the formal specification be complete by having the illegal states be unrepresentable? You can assert there are no cycles either through the formal specification or through the data structure that the formal specification is proving. Either way, the job is done.
Yeah, making states in the “machine” unrepresentable is a good idea, but you still want to test that the unrepresentable states cover all the invalid states. So you still want to make them representable in the overall spec.
This is a place where refinement is real useful.
AbstractSpec has representable but invalid states, ImplSpec tries to make them unrepresentable, and proving that ImplSpec => AbstractSpec means that you succeeded.
I was hoping the discussion here would be about the title, and not the technological details of why Tailwind is better than standards-based CSS.
But now that we’re here, the developers who promote Tailwind need to explain to me (in simple terms) why I should write my next HTML button with this syntax taken directly from Catalyst — the latest and greatest of Tailwind:
Marketing is off-topic here. I’d have removed this link if I’d seen it earlier, but there’s some substantial comments and it seems unlikely the discussion is going to go off-topic or into a flamewar.
Gotcha. But why is marketing off-topic? I think the way open-source projects are described and how they talk about others is important.
It’s described longer here. Speaking of marketing, you need to stop exploiting Lobsters as a marketing channel. Stop submitting your own work until it’s less than a quarter of your site activity.
Thanks for the link. I now get it. Thought this would fit into the scope of Lobsters.
It might have been in scope if your other seven stories weren’t exclusively advertising your products and all 67 of your comments weren’t all in your own marketing posts. The threshold for “in scope” is lower if it’s clear you’re posting in good faith, and so far, you are not.
So if I create another Nue-related open-source project I cannot post and discuss it here?
I’m a user, not a mod, so I can’t tell you what you can and can’t do. I’ll just point you to what @pushcx said:
You are being encouraged to participate in topics posted by others to counterbalance your participation in topics that are posted by you about your own projects and products.
I get that, but that’s just not how I work. I tend to isolate myself inside my coding chamber and then come back with a thing that I want to get feedback on. During my time of development, I usually stay away from social media platforms like X, Hacker News, or Lobsters.
If you don’t want to participate normally in the community, no, you may not use Lobsters as a write-only marketing channel.
Okay. I’ll start behaving “normally” then and become more socially active
I’ve never developed with Tailwind, but it is clear to me that the Catalyst project does not suggest that you write your next HTML button with that syntax. The Catalyst docs for Button show the actual source code a developer would write to get that button:
The HTML you copied is just the generated output of a build step. Developers aren’t meant to work directly with that generated output, so how messy it looks doesn’t matter. (Unless you’re arguing it causes performance problems, in which case I’d need to see a benchmark against equivalent CSS to be convinced it’s significant.)
And answering your follow-up question from the article:
It’s the same component as the above, but with an
outline prop: <Button outline>Button</Button>.
As @adamshaylor noted, Tailwind’s docs recommend that you extract combinations of reused CSS and reused HTML as components. In other words, Tailwind assumes that you already have a component or template partial system in your front-end. I think that’s your fundamental misunderstanding about Tailwind. Catalyst is just an example set of such components.
The assumption that a “component” abstraction is already available helps explain why Tailwind recommends using its own method of abstraction,
@apply, sparingly.
For more analysis of the connection between Tailwind and front-end components, including exploration of component-related solutions that aren’t Tailwind, I recommend MrJohz’s excellent comment on Hacker News:
I’m half wondering if/when you’re going to challenge Adam Wathan to a duel.
More seriously, this clearly reads as a demand to “debate me!” which rarely goes down well, I find.
It feels to me that this piece comes more from a place of vitriol, with some post-hoc reasoning thrown in. Eg:
This might be true, but I can’t see that there’s a whole lot in the piece (eg: references) that helps us see where the author is coming from.
Ultimately, there’s nothing making you use tailwind (beyond, perhaps, a job–but such is life). Sure, there’s a whole bunch of asinine tech nonsense out in the world–but I don’t think this will persuade anyone who doesn’t already think along these lines.
With all due respect, this blog post is much more subjective and emotion-based than you might think. You have barely contradicted Adam’s points; you only contradict the marketing. Also, the title is just clickbait and outright attacking - is this really how you want to present your ideas?
I think Adam’s first blog post is still a very good read, let me quote this part:
This is an important, objective idea, and it is always missing from these discussions. I personally think that the CSS Zen Garden approach is cool, but it is simply not how any project I have ever worked on works. You don’t change a company’s look by swapping a CSS stylesheet, beyond some basic styling.
The other direction is very commonly applied, just often not knowingly. I wager that you yourself work in something like this - it is not Tailwind-specific. Simply thinking about the problem explicitly in this way helps tremendously. I don’t think there is any silver bullet in how you achieve it: you either have to have good namespacing (local CSS with custom HTML components works just as well), or use something like Tailwind (where you can, again, use HTML components to reuse code - ideally you only have the component in the code base, not the raw utility-class markup, at least for anything more complex).
I wager the reason many people like Tailwind is that it removes the non-local property of CSS, which, let’s be honest, is a huge source of issues. “Cascading” is often more harmful than useful. Tailwind (and the many other tools that simply reset styling) can make maintenance of large code bases much easier: your single-line CSS change won’t break some button n levels deep in some completely unrelated part of the site.
I have written another article about the technical differences of Tailwind and semantic CSS:
https://nuejs.org/blog/tailwind-vs-semantic-css/
This is mostly about their messaging.
Maybe, but their messaging mostly talks about the practicality of this vs. other approaches, and indeed, your article makes other claims about the practicalities; so it’s pretty disingenuous to claim that it’s purely about messaging without touching on the substance.
Eg: with this article, you talk about the amount of generated CSS–bear in mind that it’s explicitly designed to be used with dead code elimination. Eg: on one of my projects, it reduces down to about 5kb, most of which is CSS normalisation.
And your argument about the markup may be reasonable, but it’s kind of meaningless without talking about overall impact. Eg: does it have a negative performance impact? To what extent? Without that, it just comes across as “Hey, look how ugly this is!”.
Never mind the claim that they discourage using @apply to build more semantic classes, when it’s an example on the front page.

Also, you seem to predicate most of the argument on the idea that the reader who uses CSS just isn’t actually very good at using CSS. And even taken at face value, this can be read as Tailwind meeting its customers where they are - unlike the Nue approach, which comes off as kind of exclusionary. I mean, I’m all for encouraging folks to get better with their tools, but this isn’t the way to do it.
Are you saying that “subjective = bad, objective = good” and everyone should blog “objectively”?
Something very similar was the original idea that led to Scheme. In the terminology of the time: everything should be an actor. As the initial article states:
While working on this, they were led to using continuations as a fundamental primitive, and to using continuation-passing style in the compiler to hide this from the end user. This would allow “actors” to suspend themselves and pass control on to another.
Now, actors and coroutines aren’t exactly the same, but they are similar enough and one can be used to implement the other. The article “Call with Current Continuation Patterns” by Ferguson and Deugo has a section specifically about how to implement coroutines using continuations.
Suspension of an actor (or coroutine) can also forcibly happen after some time has passed, which is a natural way to implement green threads (we do this in CHICKEN).
I’ve seen a similar idea show up in responses to this post in a few other places it’s been shared: that this coroutines idea is just rediscovering the actor model. And it is, in a way. What interests me is not so much the idea of coroutines-as-basic-primitive (which is, after all, less powerful than call/cc, and morally equivalent to delimited continuations). What interests me more is the sequence type system, its similarity to state machines, and the structure and guarantees it adds to coroutines. call/cc is often considered too powerful for its own good, much like goto, and a type system like this could make coroutine-based programming more practical.
I think you’re re-discovering “extensible effects”: https://okmij.org/ftp/Haskell/extensible/more.pdf
I really like Concurrent ML (ie: mostly first-class, composable “operations”) as a model for concurrency, especially for cases that don’t easily fit the one-process-one-mailbox model of actors. As a taster, I really enjoyed this article building an implementation from scratch: https://lobste.rs/s/jfxp39/new_concurrent_ml_2017
That’s a fun issue. FreeBSD does document that nbytes can’t be larger than INT_MAX and that doing otherwise will EINVAL. Oddly enough, write(2) apparently relaxed this 10 years ago (but has a debug.iosize_max_clamp sysctl to restrict it back), so either the read(2) manpage is wrong or read skipped that relaxation. It might well be wrong: looking at the Rust stdlib, it hit that issue back in 2016 (#38590) and added a special case for macOS; this special case is still in there and still only for macOS.

macOS does not document this limit for read(2), but it might be inferrable from readv(2) stating:

OpenBSD explicitly documents a limit of SSIZE_MAX; NetBSD is more implicit but apparently makes similar guarantees:
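If you want a single code path that behaves the same everywhere, the practical workaround is essentially what the Rust stdlib’s macOS special case does: clamp each request. A minimal sketch in C (the helper name and clamp value are mine, not from any of the systems above):

```c
#include <limits.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: never ask the kernel for more than INT_MAX bytes in
 * a single read(2), so the call also works on platforms that reject (or
 * used to reject) larger requests with EINVAL. Callers still have to loop,
 * since read(2) may return a short count anyway. */
static ssize_t read_clamped(int fd, void *buf, size_t len)
{
    if (len > (size_t)INT_MAX)
        len = (size_t)INT_MAX;
    return read(fd, buf, len);
}
```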
I suspect that most things that want to read that much actually want to mmap the file. Reading 1 GiB from the network at a time may make sense for very high-throughput activities, but by the time you’re doing 1 GiB reads the overhead of the system call is amortised to nothing, and doing multiple reads in userspace is unlikely to make a measurable difference (really curious if anyone has counterexamples!).
Memory mapping files is scary. When using normal I/O operations, I/O errors are communicated as return values from function calls, which is easy to deal with. When memory mapping the file, I/O errors are communicated by the kernel sending your process a SIGBUS signal, and that might happen at literally any memory read or write operation in the mapped region.
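The usual way to turn that SIGBUS back into an error return is sigsetjmp/siglongjmp around accesses to the mapping, which is exactly the kind of ceremony that makes mmap scarier than read. A minimal sketch (names are mine; real code would scope the handler to the mapped region and think about threads):

```c
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf io_error_jmp;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(io_error_jmp, 1);        /* unwind out of the faulting access */
}

/* Copy from a mapped region; returns 0 on success, -1 if the access faulted. */
static int checked_copy(void *dst, const void *mapped_src, size_t len)
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old);

    int rc = 0;
    if (sigsetjmp(io_error_jmp, 1) == 0)
        memcpy(dst, mapped_src, len);   /* may raise SIGBUS on an I/O error */
    else
        rc = -1;                        /* the I/O error arrived as a signal */

    sigaction(SIGBUS, &old, NULL);
    return rc;
}
```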
In a lot of contexts, reading the whole file into memory at the start of the program will be the right approach. Computers have loads of RAM now; 1 GiB is hardly an unreasonably large file to hold in memory any more.
Of course, reading the file into memory in chunks smaller than 1 GiB is perfectly fine and won’t cause performance issues; people probably aren’t reading files in one go for performance reasons but because this foot-gun is really non-obvious. “Get the size of the file, then malloc that amount of memory and read() that amount of data” is how the API looks like it wants to be used.
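For comparison, the chunked version that sidesteps the foot-gun is only a few lines; a minimal sketch (the helper name and chunk size are mine):

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read exactly `len` bytes by issuing read(2) in
 * chunks well below any per-call size limit. Returns 0 on success. */
static int read_all(int fd, char *buf, size_t len)
{
    const size_t chunk = 1 << 20;        /* 1 MiB per call */
    size_t done = 0;
    while (done < len) {
        size_t want = len - done;
        if (want > chunk)
            want = chunk;
        ssize_t n = read(fd, buf + done, want);
        if (n < 0)
            return -1;                   /* I/O error */
        if (n == 0)
            break;                       /* unexpected EOF */
        done += (size_t)n;
    }
    return done == len ? 0 : -1;
}
```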
At my previous job, we had a program that would mmap() (read-only) its data rather than using read(). In the 13 years I was at the company, not once was there ever an issue with IO using that scheme, and the programs would run 24/7 serving requests. It also allowed better memory utilization: multiple instances of the program would be running, and they could all share the same pages of memory.

This might be one of those things where, if you start to notice issues from an mmap(2)ed memory region, you’ve probably got bigger issues at hand, as the kernel uses exactly the same mechanism for demand-paging executables as it does for reading your data file. Naturally, that’s more complicated if you actually have different devices for data vs. control plane work.

Although unless you have a sophisticated storage subsystem, I suspect (I’ve not done sysadmin, much less DBA, work for a long time) that the best thing to do is replace the disks and rebuild the node/volumes anyway.
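For anyone who hasn’t used the pattern, the read-only mapping described above looks roughly like this (a sketch with minimal error handling; the function name is mine):

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a data file read-only. Because the mapping is file-backed and
 * read-only, every process mapping the same file shares the same
 * physical pages in the page cache. */
static const void *map_readonly(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                          /* the mapping keeps the pages alive */
    if (p == MAP_FAILED)
        return NULL;
    *len_out = (size_t)st.st_size;
    return p;
}
```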
The problem though is, other programs can modify the file while it’s mapped. A user’s editor could truncate it and write new content to it, for example.
True. I was assuming things like DB datafiles which shouldn’t be touched with an editor.
After playing a bit in a VM, unless I made a mistake somewhere it looks a lot like FreeBSD’s read(2) was updated, but the manpage was missed. I can’t exclude having screwed up the test program, but it seems to happily read or write 4GB buffers from and to files (and does indeed report having read / written 4GB).
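A test along those lines is essentially this shape (a reconstruction for illustration, not the exact program used above; it assumes a 64-bit build, and note that a short count, not just EINVAL, is also a legal outcome):

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const size_t len = 4ULL << 30;       /* 4 GiB; assumes 64-bit size_t */
    char *buf = malloc(len);
    if (!buf) { perror("malloc"); return 1; }
    memset(buf, 'x', len);

    int fd = open("bigbuf.dat", O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    /* On a kernel that still enforces INT_MAX these fail with EINVAL. */
    ssize_t w = write(fd, buf, len);
    printf("write: %zd (%s)\n", w, w < 0 ? strerror(errno) : "ok");

    lseek(fd, 0, SEEK_SET);
    ssize_t r = read(fd, buf, len);
    printf("read:  %zd (%s)\n", r, r < 0 ? strerror(errno) : "ok");

    close(fd);
    free(buf);
    return 0;
}
```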
Did you try a 32-bit program? I think, on 64-bit kernels, 32-bit programs are able to map almost the entire 4 GiB address space and so should be able to allocate a 3 GiB buffer and read into it. That seems like the kind of thing that might not work (having objects that are more than half the size of the address space has odd interactions with C, because you can’t represent offsets within them in a ptrdiff_t). It probably works, given that the 32-bit compat layer uses the same kern_read function and should just zero-extend the arguments.
Why would I? It’s not relevant and I don’t care about it.
The behaviours FreeBSD documents are that IO operations are limited to sizes of either INT_MAX or SSIZE_MAX. On 32-bit platforms those are the same thing, so that test is useless for the question I’m asking, which is whether read(2) is still restricted to INT_MAX, or whether it’s limited to SSIZE_MAX and the manpage is outdated.

The answer is the latter, by the by.
You can’t read 2 GB or more with read(2) on a 32-bit system because the return value is signed and negative values are errors. I suppose, strictly speaking, only -1 is an error, but I know I am careless about that detail in my code!

C’s mismatch between size_t and ptrdiff_t, and POSIX’s ssize_t, are one of those things that’s basically broken and would be causing failures all the time, except we moved to 64 bits before it really started hurting.

Is there no way for userspace multitasking to be preemptive? That is, at least on Linux, are threads the lightest-weight primitive of preemptive scheduling?
Preemptive multitasking ultimately requires a way to interrupt the CPU which is executing some arbitrary code, without requiring cooperation. In the OS kernel usually some kind of timer interrupt fires routinely to mark the end of a scheduling slot, preserves the task that was executing at the time, and puts another in its place. This happens periodically, switching tasks back and forth so they all get some time.
In a UNIX process you don’t have interrupts, but you do actually have signals. Signals are not actually that different from hardware interrupts: you can mask their delivery in critical sections, they can be observed to occur in between any two instructions (they’re completely asynchronous), they nest, and they preserve the register state of the interrupted task. Many signals sent on UNIX systems are process directed and sent from another process to the victim process. They don’t have to be though! UNIX systems generally offer a thread directed signalling system call which can send signals to a specific thread within the same process. There are also signalling mechanisms that can send a signal repeatedly based on a timer.
With either timers set to deliver signals, or a control thread that sleeps and wakes and uses thread-directed signals to interrupt other threads, you can achieve a form of preemptive multitasking entirely in user mode. You can interrupt another thread that’s hogging the CPU, preserve its register state (the signal handler is given a ucontext_t which contains this and other critical state), and switch another previously preserved task onto the thread just like an operating system would, using functions like getcontext() and setcontext().
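A minimal sketch of the thread-directed half of that mechanism, using pthread_kill() with SIGURG (the choice of signal, the names, and the fixed 10 ms quantum are illustrative, not taken from any particular runtime; a real scheduler would save the ucontext_t and switch tasks instead of just logging):

```c
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <ucontext.h>
#include <unistd.h>

/* Runs on the victim thread, between two of its instructions, and receives
 * the interrupted register state via the ucontext_t argument. */
static void on_preempt(int sig, siginfo_t *info, void *uc_void)
{
    (void)sig; (void)info;
    ucontext_t *uc = uc_void;                  /* interrupted register state */
    (void)uc;
    write(STDOUT_FILENO, "preempted\n", 10);   /* async-signal-safe, unlike printf */
}

static void *worker(void *arg)
{
    (void)arg;
    for (volatile unsigned long i = 0; ; i++)  /* hog the CPU, never yield */
        ;
    return NULL;                               /* unreachable */
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_preempt;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGURG, &sa, NULL);

    pthread_t t;
    pthread_create(&t, NULL, worker, NULL);

    /* Control thread acting as the "scheduler": one interrupt per quantum. */
    for (int i = 0; i < 3; i++) {
        usleep(10 * 1000);                     /* 10 ms quantum */
        pthread_kill(t, SIGURG);               /* thread-directed signal */
    }
    return 0;                                  /* exiting kills the worker */
}
```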
In general I’m not sure it’s valuable to do this. It’s complex, and would be hard to get right (signals are, like interrupts, notoriously difficult to reason about). The kernel generally has more and better information, and cheaper tools to act on that information with respect to context switching.
If anyone is looking for an example, it looks like the Go runtime (as of 1.14.0) has a control thread which sends a signal to other threads for preemption.
Thank you for your educational response! 🙏🏾 TIL that signals can nest.
Threads are the golden path for preemptive multitasking; simple and robust. But do threads have higher overhead than the above scheme? That is, we have green threads / M:N schedulers / etc. because threads can’t be packed as densely as these userspace context-switching mechanisms. But, to the best of my knowledge, those are all cooperative multitasking mechanisms.
(Ab)using signals seems like a sick hack; but I suppose my real question is: do kernels offer a similar golden path for preemptive multitasking?
Compared to pre-emptive user-space threads, the overhead of kernel-space pre-emptive threads is a large in-kernel struct and other kernel-level data structures associated with each thread.
In user-space concurrency, you can eliminate two other big overheads by avoiding pre-emption and “stackful” threads. Cooperative threading requires saving much less execution context than pre-emptive threading. Stackless concurrency also frequently requires much less context, i.e. you don’t need to save an entire stack, just the explicit context associated with each coroutine, as sketched below (but you have to be careful with heap fragmentation here).
None of this matters when you only have a relatively small number of threads, but when you get into the 10^5 or 10^6 region, it starts to matter.
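To make the “explicit context” point concrete: a stackless coroutine can be as small as a struct holding its resume point and live variables. A minimal sketch using the classic switch-based trick (the names and protocol are illustrative only):

```c
#include <stdio.h>

/* A stackless coroutine: everything that must survive a suspension lives
 * explicitly in this struct, not on a saved stack. */
struct counter_co {
    int state;   /* resume point: 0 = start, 1 = after a yield */
    int i;       /* live variable that must survive suspension */
    int limit;
};

/* Returns 1 and sets *out when it yields a value, 0 when finished. */
static int counter_resume(struct counter_co *co, int *out)
{
    switch (co->state) {
    case 0:
        for (co->i = 0; co->i < co->limit; co->i++) {
            *out = co->i;
            co->state = 1;
            return 1;        /* suspend: all context is already in *co */
    case 1:;                 /* the next call resumes here, inside the loop */
        }
    }
    return 0;                /* done */
}

int main(void)
{
    struct counter_co co = { .state = 0, .i = 0, .limit = 3 };
    int v;
    while (counter_resume(&co, &v))
        printf("yielded %d\n", v);
    return 0;
}
```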
On the flip side of that, a context switch in the kernel scheduler is (usually) a lot cheaper than signal delivery. Signals require tens of thousands of cycles of work on typical *NIX systems. The main benefit of userspace scheduling is that you can switch between userspace threads cooperatively much more cheaply.
There’s an annoying property of scheduling: cooperative scheduling has much better performance when everyone is playing nicely, but catastrophic performance when they are not. In contrast, preemptive scheduling has far better worst-case performance than cooperative, but is less good when everyone is playing by the rules. The ideal model is probably a cooperative threading engine with preemptive fallback. The thing I really want for this in a desktop OS is a way of controlling the next timer interrupt from userspace and getting the signal only if it expires (and, similarly, another hardware timer for the kernel to control to trigger switches to other processes). This would let a userspace scheduler keep pushing the signal horizon forwards as long as it’s context switching fast enough, but as soon as one thread hogs the CPU, it gets preempted.
In a user space threading system I wrote, I did this for exactly these efficiency reasons. I realized late in the process that pre-emption had significant negative efficiency implications.
This is actually an area where OpenBSD and other libc-dependent systems like macOS have an advantage over Linux. Since system calls must pass through a C function call interface, you don’t have to save as much context during the system call. It could be nearly as fast as a cooperative user space swap (modulo the cost of entering and leaving kernel mode, which is cheap, or at least was pre-PTI). I’m fairly certain they currently just save the entire user space execution context, as during pre-emption, for simplicity’s sake.
Even without doing that, the scheduler should already be able to reset the timer on context switch, but I think, again for simplicity’s sake, it always just runs at 100 Hz. Linux now does something smarter in this vein, called “tickless” mode.
While libcurl(3) is well-documented, I find curl(1)’s documentation to be a huge dump of detail and difficult to read:
I don’t think length is necessarily the problem. Personally I find curl(1) (and bash(1), a similarly long man page) to be very legible, but it helps to have internalised some rules for jumping to particular sub-sections.

I very rarely use curl, so I have never internalised its commands, yet every time I use it I am able to quickly find the answer for what I want to do in the man page with a quick search for a relevant term.
It sounds like you’re looking for a more introductory or tutorial style of documentation? If that’s the case, that’s totally fair. Man pages are great as a reference, but awful as a way of learning something from the ground up.