I really like the idea of taking unix pipes and making them a distributed communication substrate.
A pull does not remove a message from a dnpipe, it merely delivers its content to the consumer.
How does one restart consumption on failure and restart? It is a distributed system, after all.
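To make the question concrete, here is a minimal sketch of one possible answer, assuming a log-plus-offset model in which the consumer records how far it has read. The `Log` class, its `pull(offset)` signature, and the `committed` variable are all hypothetical and are not taken from the dnpipes spec.

```python
class Log:
    """Hypothetical append-only log; a pull never removes a message."""

    def __init__(self):
        self._messages = []

    def push(self, msg):
        self._messages.append(msg)

    def pull(self, offset):
        # Deliver the message at `offset` without deleting it, or None if
        # nothing has been written at that position yet.
        if offset < len(self._messages):
            return self._messages[offset]
        return None


log = Log()
for m in ["a", "b", "c"]:
    log.push(m)

committed = 0  # assumed to be stored durably by the consumer
seen = []

# First run: process two messages, then "crash".
for _ in range(2):
    seen.append(log.pull(committed))
    committed += 1

# After restart: resume from the committed offset instead of from zero.
resumed = log.pull(committed)
```

Without something like `committed` surviving the crash, a restarted consumer would have no way to know whether to start at the beginning or the end.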
There also doesn’t seem to be a pushback mechanism, which I personally believe is really fundamental to getting pipes right. But I’m open to being shown I’m wrong. In distributed systems, though, a common failure case is overloading the other side with more work than it can keep up with.
The reference implementation has some oddities too, like how do you send a message with the content RESET? Usually reference implementations are complete and correct but maybe not production ready.
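One common way around the RESET ambiguity is to tag each message with its kind, so that a control message and a data message whose payload happens to be the string RESET are different frames on the wire. A rough sketch, assuming a JSON envelope; the `kind` and `payload` field names are made up for illustration and are not part of the spec or the reference implementation.

```python
import json


def encode(kind: str, payload: str = "") -> bytes:
    # Wrap every message in an envelope so control and data never collide.
    return json.dumps({"kind": kind, "payload": payload}).encode("utf-8")


def decode(frame: bytes):
    msg = json.loads(frame.decode("utf-8"))
    return msg["kind"], msg["payload"]


data_frame = encode("data", "RESET")  # ordinary message whose content is "RESET"
control_frame = encode("reset")       # the actual control signal
```

Here `decode(data_frame)` gives `("data", "RESET")` and `decode(control_frame)` gives `("reset", "")`, so the consumer can tell them apart.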
Both points you’re raising (failure and message with content RESET) are very valid points. Looks like I’ve got some more work to do ;)
Ah, and @apy could you elaborate on ‘pushback mechanism’ please?
Presumably he means backpressure.
Yes, backpressure. In Unix pipes, if you write data faster than the other side can process it, you’ll eventually be stuck in a write call, or a non-blocking write will tell you to try again later. Your two options in a distributed system are to either queue data and hope you don’t run out of memory or to have backpressure. The latter is generally considered the more robust solution. If you plan on making a fundamental dist sys component I’d suggest reading through some of the classic literature. Check out @aphyr’s reading list or any of the others that show up in a Google search.
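A small in-process illustration of the non-blocking case, using Python's standard `queue.Queue` with a size limit as a stand-in for a pipe buffer. The `try_send` helper is a hypothetical name; this is only an analogy for the behaviour described above, not a distributed implementation.

```python
import queue


def try_send(pipe: queue.Queue, item) -> bool:
    """Non-blocking write: report 'try again later' instead of buffering without bound."""
    try:
        pipe.put_nowait(item)
        return True
    except queue.Full:
        return False


pipe = queue.Queue(maxsize=2)  # bounded buffer, like a pipe's fixed-size buffer

# A fast producer tries to write four items while nobody is reading.
results = [try_send(pipe, n) for n in range(4)]

# The slow consumer finally reads one item, which frees one slot.
drained = pipe.get_nowait()
retry = try_send(pipe, 3)
```

The first two writes succeed, the next two are refused, and the retry goes through only after the consumer has drained an item. Using `pipe.put(item)` instead would give the blocking variant, where the producer is stuck in the write call until there is room.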
Thanks for the clarification and yes that’s exactly what Kafka is taking care of in my ref impl., maybe I should make this more explicit. Thanks also for pointing out the literature; it happens to be what I’ve been teaching in academic and industrial setups and training courses so I think I’ve got that part covered ;)
No, the goal is not to build a component, the goal is to write a spec, to document a pattern I’ve seen being used in practice, potentially resulting in an RFC.
By component I do not mean a software artifact.
Kafka semantics don’t really match Unix pipe semantics so I’m not sure binding yourself to it makes a lot of sense. Data in Unix pipes is generally transient. Have you looked at what it’s like to implement message queues with pipes on 9p?
It’s fine if you just want a simpler Kafka client but I’d be wary of calling it “distributed pipes” as that doesn’t really convey the reality to people familiar with conventional pipes.
Sounds like 10% of the functionality of ZeroMQ?