1. 45
  1.  

  2. 43

    I would guess that a very substantial proportion of the people who read Lobsters have heard of Timsort.

    1. 22

      That, and TimSort really only makes sense in certain situations:

      • Comparisons have to be extremely expensive. Think dynamic dispatch. This is because TimSort itself performs a bunch of branches and integer comparisons to keep track of galloping scores and the stack invariant.

      • You need a stable sort. If you don’t need a stable sort, pattern defeating quicksort will probably do better.

      1. 11

        Quicksort is the Achilles of sorting algorithms: unbeatably fast, easy to implement, in-place; but with the vulnerable heel of bad worst-case performance (the worst case being pre-sorted data in the naïve implementation) and instability.

        1. 5

          There’s a fairly easy fix to that, called introsort: start with quicksort, but bail out to a guaranteed O(n log n) sort like heapsort if it takes too long. In the bail-out case, you lose constant-factor performance compared to if you had used heapsort in the first place, but you avoid quicksort’s O(n^2) worst case, while still getting its good performance in non-pathological cases. It’s used in practice in .NET and some C++ STL implementations.

          1. 3

            Quicksort -> Heapsort is method I used. It worked fine in practice. I love solutions like that. Another, unrelated one was sklogic’s trick of using a fast, dumb parser first to see if it’s correct. If it wasn’t, he switched to one that made error messages easier.

            I bet there’s more of this stuff waiting to be found for situations where people are shoving every case into one algorithm.

        2. 4

          Comparisons have to be extremely expensive. Think dynamic dispatch.

          That explains why Python uses it as it’s standard sort.

          1. 9

            Yeah. That’s exactly why Python uses TimSort.

            More tellingly, where Rust uses an algorithm that’s related to TimSort for its stable sorting algorithm, they didn’t implement “galloping” because it’s not worth it. https://github.com/rust-lang/rust/blob/7130fc54e05e247f93c7ecc2d10f56b314c97831/src/liballoc/slice.rs#L917

        3. 10

          I consider myself relatively knowledgeable about many different topics of CS and had not heard of Timsort until this article. What’s the point of your comment? That the article is not worth posting as you presume that it is widely known?

          1. 4

            The point is that the title of the article, and of this submission, is inaccurate. I would call the title clickbait because for most readers, the article doesn’t deliver what it promises – a sorting algorithm “you’ve” never heard of. I think the article itself is fine; it’s just the title that is a lie.

            1. 5

              That seems to be a really low-value comment. For whom is the remark actually intended? For other crustaceans to look at, nod and agree, thinking, “yes, I too possess the superior knowledge”? Does every submission with a title that invokes “you” need to be correct and ‘deliver on its promise’ for all possible “you”s? C’mon.

              1. 4

                Yes, I suppose jgb could have been more explicit in why they brought up their guess. (I wrote an explanation of my interpretation of the comment, not the original comment.)

                Does every submission with a title that invokes “you” need to be correct and ‘deliver on its promise’ for all possible “you”s?

                I think every article with a title that invokes “you” needs to be correct and ‘deliver on its promise’ for the majority of possible “you”s in its audience. If a title says “you’ll love this” and most readers don’t love it, the title was wrong, and it wasted people’s time by getting them to open the article on false pretenses. It is up to article authors to adhere to that principle or not.

                As for titles of submissions of articles with clickbait titles, there can be a conflict between submission titles that reflect the author’s intent and titles that accurately describe the article. I don’t have a simple answer as to when submission titles should differ from the article title.

                1. 3

                  I think every article with a title that invokes “you” needs to be correct and ‘deliver on its promise’ for the majority of possible “you”s in its audience.

                  I think I agree with this, and I think my concern comes down to disagreeing instead with the notion that the majority(/“a very substantial proportion”) of Lobsters readers have heard of Timsort. Short of a poll there’s not an actual answer to that; I just felt particularly rankled because I hadn’t, and presumably if I had I wouldn’t have bothered or thought to comment myself.

                  I err on the side of preserving the article title in the submission, which I think is pretty common. Accordingly, I think most Lobsters are primed to see submission titles that aren’t necessarily addressing them as Lobsters readers, but in some context that might be quite removed.

          2. 2

            I thought it played a pretty big role in the Oracle vs. Google lawsuit too, making it one of the more famous algorihtms.

            However see “rangeCheck” mentioned a lot, which is a trivial part of TimSort.

            https://en.wikipedia.org/wiki/Oracle_America,_Inc._v._Google,_Inc.

            Here it seems to cite TimSort. But for some reason I can’t find a lot of sources that talk about TimSort and the lawsuit, even though at the time I remember it being a prominent thing.

            https://forums.appleinsider.com/discussion/149435/google-engineers-defend-source-code-email-in-oracle-lawsuit-over-java/p4


            edit: here’s another one mentioning TimSort and the lawsuit.

            https://majadhondt.wordpress.com/2012/05/16/googles-9-lines/

            Googling “rangeCheck timsort lawsuit” digs up some results.

          3. 10

            TimSort is non trivial! Python and Java implementations of TimSort were for some time incorrect, but I believe this had been fixed. A detailed post that describes this is here: http://www.envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/

            1. 9

              I can’t really understand why “not constructed in academia” is right up there at the top of the list. Is academic research being held back by some kind of Knuthist orthodoxy?

              1. 2

                Most academic research is impractical in some shape for form. Some of it is good. I always distinguish between the two to be fair to those building useful or potentially-useful stuff. You could actually corroborate this with just one example that boosted productivity worldwide, paying for itself and many others in the big picture. It started there followed by what the business did.

                RISC CPU’s, ARM and MIPS empires, came out of academia, too. MATLAB was started by researchers. Coverity started as a way to make compilers do checks on code. In formal verification, CompCert compiler sold commercially started as academic research. Note that I’m excluding research by companies since I think that might get dismissed as more productized focus. Just folks at research organizations whose research proved enormously useful. In CompCert’s case, more practical than most thought it would get.

              2. 1

                For a non-trivial and practical sorting algorithm you probably havent heard of, I found this CppCon talk by Andrei Alexandrescu incredibly interesting.

                https://www.youtube.com/watch?v=FJJTYQYB1JQ