Great write-up, I like how you get into the details and give context to the design decisions. It makes a lot of sense, both conceptually and architecturally.
I’ve always wondered if one could use GPUs to speed up Prolog.
The way prolog is written in practice tends to lean pretty heavily on the order in which things are evaluated – you make sure predicates fail fast, and to do that, you take advantage of the knowledge that the order in which predicates are defined is the order in which they are evaluated (and then you use cuts to prevent other paths from being evaluated at all). A lot of real code would fail if it got implicitly parallelized in any way. (This is one of the reasons I didn’t want to keep compatibility.)
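To make the ordering point concrete, here is a toy sketch in Python (hypothetical, not real Prolog semantics): facts are tried strictly in definition order, and a "cut" commits to the first match and prunes the rest of the search.

```python
# Toy model of ordered clause selection. The predicate name and facts
# are made up for illustration.
facts = {
    "color": ["red", "green", "blue"],  # definition order matters
}

def solutions(pred, cut=False):
    """Enumerate matches in definition order; commit to the first if cut."""
    for value in facts[pred]:
        yield value
        if cut:  # roughly like Prolog's !: don't backtrack into other clauses
            return

print(list(solutions("color")))            # ['red', 'green', 'blue']
print(list(solutions("color", cut=True)))  # ['red']
```

Code written against this model works precisely because "red" is always tried before "blue"; any implicit parallelization that races the three facts breaks that contract.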
It’s pretty trivial to make a prolog-like language that implicitly parallelizes most backtracking. (In fact, that’s most of what this is.) But, when used naturally, it would cease to have the same kind of operation ordering guarantees. (You could backtrack on the results after parallel execution, in order to figure out which path should have run first, but there aren’t enough prolog programmers to justify keeping compatibility IMO.)
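A rough sketch of that trade-off, assuming a stand-in `explore` function for evaluating one clause: branches run in parallel and complete in a nondeterministic order, so recovering sequential Prolog's "first answer" means sorting by clause index after the fact.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

branches = ["clause_a", "clause_b", "clause_c"]  # definition order

def explore(i):
    # stand-in for evaluating one clause of a predicate
    return (i, f"solution via {branches[i]}")

with ThreadPoolExecutor() as pool:
    futures = [pool.submit(explore, i) for i in range(len(branches))]
    # completion order here is nondeterministic...
    results = [f.result() for f in as_completed(futures)]

# ...so to mimic sequential evaluation you re-sort by clause index afterwards.
ordered = [sol for _, sol in sorted(results)]
print(ordered[0])  # the answer depth-first evaluation would find first
```

The sort at the end is the "backtrack on the results" step mentioned above; you pay for it only after all branches have already done their work.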
I’m not sure GPUs would be more beneficial than having extra conventional CPU threads, since what we’re doing is basically a search. However, maybe there’s some way to model a tree search as matrix multiplication at compile time or something. (I don’t really have enough of a math background to say. They managed to do it with neural nets, so I guess anything’s possible.)
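One known way to phrase search as matrix math: reachability in a graph can be computed by repeated boolean products of the adjacency matrix, which is exactly the bulk, regular arithmetic GPUs like. A minimal pure-Python sketch (the graph is made up for illustration):

```python
def bool_matmul(a, b):
    """Boolean matrix product: result[i][j] iff some k links i to j."""
    n = len(a)
    return [[any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# adjacency matrix of a small DAG: 0 -> 1 -> 2, and 0 -> 3
adj = [
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# (adj @ adj)[i][j] is true iff j is reachable from i in exactly 2 steps
two_step = bool_matmul(adj, adj)
print(two_step[0][2])  # True: 0 -> 1 -> 2
```

Whether a Prolog-style search tree can be flattened into this shape at compile time is the open question; this only shows that fixed graph search fits the matmul mold.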
thanks for the reply! I don’t really know much about Prolog, but I thought about this when I was last doing some CUDA stuff. I didn’t know that evaluation order is relied on so much in practice.
maybe tree searches could be somewhat parallelized if the tree were stored the way a closure table is stored in SQL, but that’s a wild uneducated guess :)
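For readers unfamiliar with the term: a closure table precomputes every ancestor/descendant pair, so "all nodes under X" becomes one flat, easily parallelizable scan instead of a recursive walk. A minimal sketch with sqlite3 (table and column names are made up for illustration):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE closure (ancestor TEXT, descendant TEXT, depth INT)")

# tree: root -> a -> b, root -> c
parents = {"a": "root", "b": "a", "c": "root"}

rows = []
for node in ["root", "a", "b", "c"]:
    rows.append((node, node, 0))        # every node is its own ancestor
    cur, depth = node, 0
    while cur in parents:               # walk up to the root once, at build time
        cur, depth = parents[cur], depth + 1
        rows.append((cur, node, depth))
db.executemany("INSERT INTO closure VALUES (?, ?, ?)", rows)

# the whole subtree of root in one non-recursive query
subtree = [r[0] for r in db.execute(
    "SELECT descendant FROM closure"
    " WHERE ancestor = 'root' AND depth > 0 ORDER BY descendant")]
print(subtree)  # ['a', 'b', 'c']
```

The cost is write amplification (one row per ancestor per node), which is the usual space-for-parallelism trade.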