1. 50

Which papers piqued your interest? Though very old, I really enjoyed reading the Dynamo paper from Amazon

  1.  

  2. 11
    1. Hybrid Logical Clocks: By making a reasonable assumption such as “hardware clocks will be periodically synchronized”, this paper introduces a really nice and practical logical clock that can also be used as a physical clock in order to partially order events in a distributed system.
    2. Delta-state CRDTs: This paper provides an optimization over state-based CRDTs, as well as a unifying abstraction allowing for the composition of multiple types of CRDT. There are also some really nice optimizations.
    3. Thicket, a protocol for maintaining multiple spanning trees for broadcast in a P2P system. The protocol uses multiple trees in order to balance the number of times a peer is a non-leaf node of a spanning tree, thereby evenly spreading the cost of forwarding messages.
    1. 4

      Corrected link for Thicket.

      1. 2

        Thank you! Updated :)

    2. 7

      This year I drilled into transaction protocol and consistency papers a bit, spreading out from storage engine concerns that occupied much of 2017 for me. Some of the ones that stood out for me were these:

      I’m suspecting that in 2019 I’ll turn more attention to lock-free programming, replication, correctness testing, formal methods, and maybe more query compilation :]

      1. 6
        1. 6

          I feel like I’m generally looking at families of papers and related code, but if I had to pick a single paper that surprised me, that was well-written, and that I learned a lot from, it might be this “linker speak” paper:

          The missing link: explaining ELF static linking, semantically

          There seems to be a lot of talk about “undefined behavior” in C now, which is where C in practice diverges from the spec.

          But they bring up the fact that linking is nontrivial and it doesn’t even have a written spec, much less a formal spec. I guess this is because it’s very platform-dependent.

          They say this is the first formal specification of linking, in Lem and in Isabelle/HOL, although I don’t know much about that part.

          I was reminded of this paper by Matt Godbolt’s recent CppCon talk, which covers a lot of the same territory, more from an engineering point of view: The Bits Between the Bits: How We Get to main().


          Others:

          • The WebAssembly paper. I’m predisposed to be annoyed by WebAssembly because it’s more complexity, but it’s clear they put a lot of thought into their work, and I learned a lot from how they explained it. And it works and it is deployed. I’m skeptical of how truly language-independent it is, especially because of the upcoming GC, but nonetheless it was worth learning about.

          • Regex derivatives. There is a 2009 paper that revived this topic, and a bunch of code and related projects.

          • Futamura projections, Scala LMS. I have an upcoming blog post where I’ll probably provide proper links for all this stuff.

          1. 4

            I’m skeptical of how truly language-independent it is, especially because of the upcoming GC, but nonetheless it was worth learning about.

            Apparently, it should be seen as supporting languages that have garbage collectors, as opposed to having a garbage collector itself. From HackerNews:

            https://news.ycombinator.com/item?id=18690746

            1. 1

              I can see why it’s phrased that way, but either way, the problem is hard. That is, it will be hard to make every language share the host’s GC (assuming you want to run Python, Ruby, Java, C#, OCaml, etc.). It seems inevitable that some languages and some language implementation techniques will be favored over others in terms of efficiency.

            1. 5

              I have so many conflicted emotions about this as most of my time is spent reading CS papers.

              Do I suggest http://papers.nips.cc/paper/7845-learning-to-infer-graphics-programs-from-hand-drawn-images for sheer coolness factor of learning programs from images?

              Do I suggest https://arxiv.org/abs/1810.12894 for great science and insight into how to properly reward exploration in RL?

              Do I suggest https://arxiv.org/abs/1810.02720 for showing a reliable framework for doing code generation?

              Do I suggest http://papers.nips.cc/paper/8107-improving-neural-program-synthesis-with-inferred-execution-traces for really starting to get at what it means learns the semantics of a program?

              Do I suggest https://arxiv.org/abs/1810.04152 for having a really nice theoretical result with real practical implications?

              Do I suggest https://arxiv.org/abs/1802.02538 for talking about how we are very shitty about diagnosing problems in ML models?

              I feel like I am doing a terrible disservice in describing these papers, but truthfully I loved all these papers and dozens more. The more I read, the harder it is nowadays for a paper to really rock my world and fundamentally change how I think about problems. Their results are all so much more fragile they appear at first glance. Most papers I like for posing problems really well and offering a framework to think about solutions for that problem. I’m often these days less interested in particular solutions as much as a little inspiration of how to solve my own problems.

              1. 5

                Why Philosophers Should Care About Computational Complexity (by Scott Aaronson). It contains some interesting ideas, and it’s written so that you can get something out of it without a deep understanding of either philosophy or CS.

                1. 4

                  Parallelizing Dynamic Programming Through Rank Convergence. I found the longest common sub-sequence example especially compelling, particularly the way that the paper provides a mathematical rigor for the intuition that the large parts of an LCS match could be found in parallel because there’s a bit of “slop” between them if they don’t overlap. There is an error in one of the diagrams, though, that really tripped me up for a bit.

                  1. 3

                    To BLOB or Not To BLOB by the late Jim Gray.

                    1. 3

                      The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities

                      https://arxiv.org/pdf/1803.03453.pdf

                      Not very important but incredibly funny.

                      1. 2

                        I found the Blockstack whitepaper quite interesting.

                        1. 2

                          Objects as Closures, which I read every year.

                          1. 3
                            1. 2

                              Also available at no cost from NASA, it seems.

                            2. 2

                              I liked the “Programming Paradigms for Dummies” paper. https://www.info.ucl.ac.be/~pvr/VanRoyChapter.pdf