Hm interesting, I looked at the gg source code after seeing your initial post about it. But I didn’t catch that it had its own HTTP client with pipelining! I remember reading about HTTP pipelining a long time ago in a book, but never actually worked with it.
gg is indeed an impressive piece of engineering! The one thing I didn’t quite like was that the “model substitution” seemed kind of laborious and error-prone, e.g.:
It seems like they have to understand every flag that the compiler itself takes, and open the same files in each case, so the execution can be traced correctly? Otherwise the remote worker would be missing dependencies (e.g. if you pass -fsanitize=address, then you need the ASan runtime on the worker so you can link against it, etc.).
At least that was my impression about how it worked.
I had missed the llama post but it looks promising! xargs -P is pretty much my favorite shell command :)
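Since xargs -P came up: a minimal sketch of the fan-out pattern, in case anyone hasn’t seen it (the “items” here are just made-up numbers):

```shell
# Run up to 4 worker processes in parallel; -n 1 hands each
# invocation a single argument, which sh -c receives as $0.
seq 1 8 | xargs -n 1 -P 4 sh -c 'echo "item $0"'
```

Output order isn’t deterministic (that’s the point of -P), but you get all 8 lines back.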
I’d be interested to hear any details on what you learned about the Lambda API and runtime, e.g. the zip files vs. Docker images, kernel, etc.
gg being portable across clouds (Lambda, Google Cloud) is very interesting to me, but the paper left out a lot of the engineering details of making that happen! Though I just peeked at the Lambda engine and it looks pretty short and reasonable: https://github.com/StanfordSNR/gg/blob/master/src/execution/engine_lambda.hh
I guess most people use a big SDK, but they rolled their own in C++!
I tried pipelining HTTP POST requests (*) once for a very specific use case: trying to beat gsutil -m at deleting large numbers of files from a GCP bucket that was full of garbage. Unfortunately the GCP backend seems to be pretty slow, and from what I saw it did not read the second request from the wire until the first had already been replied to, so in my case pipelining didn’t help any.
(* AIUI the RFCs recommend against, if not outright forbid, pipelining requests that aren’t read-only, but eh, sometimes it works and is useful for a specific use case.)
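For anyone who hasn’t seen pipelining up close, the whole trick is just writing requests back-to-back before reading any response. A rough sketch of the byte stream I was sending (bucket and object names are hypothetical, and real requests would also need auth headers):

```shell
# Two DELETE requests concatenated on what would be one connection,
# with no read in between -- that's all pipelining is.
# Here we just count the request lines instead of hitting a server.
printf 'DELETE /my-bucket/junk-0001 HTTP/1.1\r\nHost: storage.googleapis.com\r\n\r\nDELETE /my-bucket/junk-0002 HTTP/1.1\r\nHost: storage.googleapis.com\r\nConnection: close\r\n\r\n' \
  | grep -c '^DELETE'
```

To actually send it you’d pipe those same bytes to something like `openssl s_client -quiet -connect storage.googleapis.com:443` instead of grep. The server is still free to process the requests serially, which is exactly what I ran into.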