1. 32

Or some of the best papers.

  1. 10

    Ready for the most mind-blowing paper I’ve read this year?

    Fixing Faults in C and Java Source Code: Abbreviated vs. Full-word Identifier Names

    Yeah, I wasn’t expecting that either. But I love this paper because of just how good it is. They review the prior literature in depth, their experimental design rules out every confounding factor you can think of, and they follow up with an ethnography of how programmers find defects in abbreviated codebases. I think everyone should read it, just to know what a good experimental SE study looks like.

    1. 2

      Skimming the paper, I didn’t notice any kind of power analysis, and the group sizes in Table V are very small. So the claim from the abstract, “Overall results suggested that there is no difference in terms of effort, effectiveness, and efficiency to fix faults, when source code contains either only abbreviated or only full-word identifier names”, seems like overreach.
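      A back-of-the-envelope power calculation shows why small groups are a problem. This is a sketch using the standard normal approximation for a two-sample t-test; the effect sizes and thresholds are illustrative conventions, not numbers from the paper:

```python
# Approximate per-group sample size for a two-sided two-sample t-test,
# via the usual normal approximation:
#   n ≈ 2 * ((z_{1-alpha/2} + z_power) / d)^2
from statistics import NormalDist

def required_n_per_group(effect_size, alpha=0.05, power=0.80):
    """Per-group n needed to detect a standardized effect (Cohen's d)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance threshold
    z_beta = z.inv_cdf(power)            # desired power
    return 2 * ((z_alpha + z_beta) / effect_size) ** 2

# A "medium" effect (d = 0.5) already needs ~63 subjects per group.
print(round(required_n_per_group(0.5)))  # → 63
```

      With only a handful of subjects per cell, only very large effects would be detectable, which is why “no difference found” is weak evidence for “there is no difference”.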

      I will still happily cite this paper next time someone complains about me using single-letter variable names to reduce line-wrapping and lines of code, though. :-)

      1. 1

        I also read that one, +1 for a great piece of work

      2. 6

        I would say Stephen Kell’s work on the “lurking Smalltalk within Unix” has been on my mind the most.

        His papers are full of great information and thoughtful arguments (full list on his personal site). At times, they can be dense, and they might take a few readings to absorb, but that’s true of anything worth reading. It’s all “systems” research and not particularly mathematical, so there shouldn’t be any fundamental barriers.

        I think it is worth comparing Unix to Smalltalk/Lisp. The latter systems definitely have a “purity” to them, but people sometimes overlook that Unix can do the same things. His work on liballocs is about reflection over C data types in Unix processes. It has some philosophical similarity to DTrace.

        FWIW it also reminds me of one of my blog posts, where someone suggested a Lisp-ish solution, and my response was that “Unix can already do that”. [1]

        I think what I like about his work is that it’s inherently “conservative”: it tries to understand what existing systems already do – which is surprising in many cases! A lot of papers propose new things without adequately understanding what practitioners already do.

        http://www.cl.cam.ac.uk/~srk31/#onward15

        http://www.cl.cam.ac.uk/~srk31/research/papers/kell15towards-preprint.pdf

        Videos:

        https://www.youtube.com/watch?v=saIFAQdxD-U&feature=youtu.be

        https://www.youtube.com/watch?v=LwicN2u6Dro

        [1] http://www.oilshell.org/blog/2017/01/13.html

        1. 3

          Stephen Kell is doing some really interesting stuff, but seems to be largely overlooked. I posted the video accompanying the paper here a while back; glad to see another “fan” (if that’s the right word).

          1. 1

            Totally out of step with the mainstream, so he gets ignored.

        2. 4

          My favorite read this year is a classic: C.A.R. Hoare’s “Hints on Programming Language Design” (1973). I was shocked to find that everything discussed remains highly relevant in modern language design 44 years later.

          1. 4

            Also, I recommend this video that explains a classic paper:

            John Myles White on “Fundamental Concepts in Programming Languages” by Strachey:

            https://www.youtube.com/watch?v=cO41uoi5cZs

            Summary: there’s a lot of stuff we take for granted around the ideas of values and references in programming languages.


            If you don’t know about the ARC caching policy, it has the notable distinction of being both relatively recent and also highly deployed. (Usually CS concepts take a long time to be deployed, and I trust deployed concepts more than undeployed ones). This explanation by Bryan Cantrill is nice and entertaining:

            https://www.youtube.com/watch?v=F8sZRBdmqc0

            1. 1

              Can you cite a source on the high deployment of ARC? https://en.wikipedia.org/wiki/Adaptive_replacement_cache#Deployment seems pretty sparse.

            2. 3

              “Using Coq to Write Fast and Correct Haskell” by John Wiegley & Benjamin Delaware

              “A Tale of Two Provers” by Niki Vazou, Leonidas Lampropoulos & Jeff Polakow

              1. 2

                For me, it’s probably “Will Serverless End the Dominance of Linux in the Cloud?” by Ricardo Koller and Dan Williams. It’s an interesting take on the impact the serverless computing paradigm has on the OS, especially on Linux, which dominates the cloud.

                1. 2

                  “Algebraic Graphs with Class” by Mokhov. See my blog article, which references this paper: http://www.hansdieterhiep.nl/blog/graphs-in-haskell/

                  1. 1

                    I admit I didn’t read any.

                    1. 1

                      On the emergence of invariance and disentangling in deep representations

                      From the abstract: “we show that in a deep neural network invariance to nuisance factors is equivalent to information minimality of the learned representation, and that stacking layers and injecting noise during training naturally bias the network towards learning invariant representations. We then show that, in order to avoid memorization, we need to limit the quantity of information stored in the weights, which leads to a novel usage of the Information Bottleneck Lagrangian on the weights as a learning criterion.”
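                      For context, the standard Information Bottleneck Lagrangian trades prediction error against a mutual-information penalty on the learned representation; the abstract’s twist is to put that penalty on the weights instead. The notation below is the usual IB notation, sketched from memory rather than quoted from the paper:

```latex
% Standard IB: learn a representation z of input x that predicts label y
% while compressing away everything else about x.
\mathcal{L} \;=\; H(y \mid z) \;+\; \beta\, I(z; x)
% Variant the abstract describes: penalize the information the weights w
% store about the training set D, instead of the activations.
\mathcal{L}(w) \;=\; H_{p,q}(y \mid x, w) \;+\; \beta\, I(w; D)
```

                      The first term is the usual cross-entropy fit; the second discourages the network from memorizing the dataset, which is the mechanism the abstract ties to invariance.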

                      1. 1

                        “Objects as closures: abstract semantics of object-oriented languages” (ACM DL entry; PDF at NASA.gov).

                        We discuss denotational semantics of object-oriented languages, using the concept of closure widely used in (semi) functional programming to encapsulate side effects. It is shown that this denotational framework is adequate to explain classes, instantiation, and inheritance in the style of Simula as well as SMALLTALK-80. This framework is then compared with that of Kamin, in his recent denotational definition of SMALLTALK-80, and the implications of the differences between the two approaches are discussed.

                        I read this back in 2015, and went through a mind-bending “OOP is FP, FP is OOP” paradigm shift. I gave a talk about it back then. I re-read the paper this year (making it on-topic for the question) because it’s just so concise, mind-expanding and enjoyable. This time I took more careful notes.
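                        The “OOP is FP” shift the paper triggers can be seen in a few lines: an object is just a closure over its private state that dispatches on message names. A minimal sketch (my own toy example, not the paper’s denotational semantics):

```python
# "Objects as closures": the object is a closure over its private state,
# and method calls are messages dispatched by name.
def make_counter(start=0):
    state = {"count": start}          # encapsulated mutable state

    def dispatch(message, *args):     # the "object" is this closure
        if message == "incr":
            state["count"] += args[0] if args else 1
            return state["count"]
        if message == "value":
            return state["count"]
        raise ValueError(f"unknown message: {message}")

    return dispatch

counter = make_counter()
counter("incr")           # → 1
counter("incr", 5)        # → 6
print(counter("value"))   # prints 6
```

                        Instantiation is just calling the maker, and encapsulation falls out of lexical scoping: nothing outside dispatch can reach state. Inheritance in this style amounts to wrapping one dispatch function in another.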