1. 12

Abstract: “The Fisher–Yates shuffle is an algorithm for generating a random permutation of a finite sequence—in plain terms, the algorithm shuffles the sequence. The algorithm effectively puts all the elements into a hat; it continually determines the next element by randomly drawing an element from the hat until no elements remain. The algorithm produces an unbiased permutation: every permutation is equally likely. The modern version of the algorithm is efficient: it takes time proportional to the number of items being shuffled and shuffles them in place.”
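A minimal Python sketch of that modern in-place version (Durstenfeld's variant), for illustration:

    import random

    def fisher_yates_shuffle(items):
        """Shuffle items in place; every permutation is equally likely."""
        for i in range(len(items) - 1, 0, -1):
            # Pick j uniformly from [0, i] -- note the inclusive upper bound.
            j = random.randint(0, i)
            items[i], items[j] = items[j], items[i]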

    1. 7

      Don’t forget to read the ‘Potential sources of bias’ section. It’s easy to implement this wrong.
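      For instance, the classic mistake (a sketch of the bias, not a quote from that section) is drawing the swap index from the whole array on every iteration:

          import random

          def biased_shuffle(items):
              n = len(items)
              for i in range(n):
                  # BUG: j should come from [i, n-1], not [0, n-1].
                  # The full range gives n**n equally likely execution paths,
                  # which cannot map evenly onto the n! permutations.
                  j = random.randint(0, n - 1)
                  items[i], items[j] = items[j], items[i]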

      1. 1

        44497 PRNG seed bits needed to properly shuffle a list of only 4199 items… No wonder this function is absent from so many libraries.

        1. 2

          That’s just the information-theoretic limit, though: you need about 44,488 bits (log2 of 4199!) to fully specify a permutation of 4199 items. It has nothing to do with this algorithm in particular.
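          A quick back-of-the-envelope check of that figure:

              import math

              # Bits needed to address all n! permutations: log2(n!).
              # lgamma(n + 1) = ln(n!), so no huge integers are required.
              bits = math.lgamma(4199 + 1) / math.log(2)
              print(bits)  # ~44488.7 bits of entropy for an unbiased shuffle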

    2. 6

      I usually turn to this blog post when I need a reference for the Fisher-Yates shuffle.

      It’s a good algorithm to have on hand, whether you’re shuffling a data structure or just indices into it. I’ve found a couple really elusive bugs by running tests shuffled multiple ways.

    3. 4

      Something that has been annoying me for a while now: I feel like there is this intuitive connection between Fisher–Yates and reservoir sampling, but I haven’t been able to see it yet.

      To sample k items from a finite, known sequence, you can very efficiently just run the first k iterations of a Fisher–Yates shuffle and take the first k elements.
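      A sketch of that partial shuffle in Python:

          import random

          def sample_k(items, k):
              """Draw k uniform samples via a partial Fisher-Yates shuffle."""
              items = list(items)  # copy so the caller's sequence is untouched
              for i in range(k):
                  # Swap a uniformly chosen remaining element into slot i.
                  j = random.randint(i, len(items) - 1)
                  items[i], items[j] = items[j], items[i]
              return items[:k]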

      If you want to sample from a (potentially infinite) stream, you can do reservoir sampling: in principle, for the ith element observed, you randomly select a location j in an imaginary sequence of size i and place that element there. Then, once the stream has been consumed to a satisfactory degree, you return the first k elements from this imaginary i-sized sequence. (Obviously in practice you just throw away any element that lands on a position j > k, since you’ll never see it anyway.)
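      In code, that is the classic Algorithm R (a sketch; the imaginary sequence only ever materializes as its first k slots):

          import random

          def reservoir_sample(stream, k):
              """Keep k uniform samples from a stream of unknown length."""
              reservoir = []
              for i, item in enumerate(stream, start=1):
                  if i <= k:
                      reservoir.append(item)
                  else:
                      # Drop the i-th item at a random slot of the imaginary
                      # i-sized sequence; keep it only if the slot is < k.
                      j = random.randrange(i)
                      if j < k:
                          reservoir[j] = item
              return reservoir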

      Maybe I’m looking for a similarity when in fact there’s some sort of duality going on? I’m not sure.

      1. 1

        I’m not sure if there is some kind of equivalence or duality going on. I recently learned about reservoir sampling (with a reservoir of size 1, so essentially randomly selecting an element from a stream) from Jeremy Kun’s site, and noticed some parallels as well. Here’s my naive analysis (naive, because I don’t have any particularly interesting observations).

        Both methods work inductively: they first work on an array/stream of size 1. That is, Fisher-Yates randomly permutes the elements (of the complete array) over an array of size 1, and reservoir sampling picks a uniformly random element from the stream of size 1. Further, both methods use uniform sampling, where each element has a probability of 1/n of being picked.

        Of course, the most notable difference is that Fisher-Yates operates on a stream of known size, and reservoir sampling on a stream of unknown size.

        1. 2

          > Of course, the most notable difference is that Fisher-Yates operates on a stream of known size, and reservoir sampling on a stream of unknown size.

          Or, in other words,

          • Fisher–Yates iterates deterministically through all positions of the k samples returned, and selects elements for them by randomly selecting among the items not yet included in the sample.

          • Reservoir sampling iterates deterministically through all incoming items, and selects positions for them by random selection (keeping only those that land inside the sample).

          So in that sense, they are doing opposite things in a way…

    4. 2
    5. 2

      Handy to know, certainly. Definitely in the category of “I could come up with something like this, but probably with a few flaws, so might as well just use the version that’s had all the kinks worked out.”

    6. 2

      Is anyone aware of an efficient shuffling algorithm that is biased, for something like shuffling playlists? I basically want to put tracks into nested buckets (artist -> album), shuffle each bucket, and then spread the tracks out, so that tracks from the same artist, and likewise tracks from the same album, end up as far apart as possible.

      There are implementations, but I would like to know if there is something better.

      Spotify used to use the Fisher-Yates shuffle but switched to an algorithm along these lines:

      http://keyj.emphy.de/balanced-shuffle/
      https://labs.spotify.com/2014/02/28/how-to-shuffle-songs/
      https://cjohansen.no/a-better-playlist-shuffle-with-golang/
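      For what it’s worth, here is a rough sketch of the “spread each bucket evenly, with jitter” idea from those posts. It assumes tracks are dicts with an "artist" key and handles only one bucket level; nothing here is Spotify’s actual code:

          import random
          from collections import defaultdict

          def balanced_shuffle(tracks, key=lambda t: t["artist"]):
              buckets = defaultdict(list)
              for t in tracks:
                  buckets[key(t)].append(t)
              scheduled = []
              for bucket in buckets.values():
                  random.shuffle(bucket)  # Fisher-Yates within the bucket
                  n, phase = len(bucket), random.random()
                  for i, t in enumerate(bucket):
                      # Even spacing 1/n, a random per-artist phase, and a
                      # little jitter so the result doesn't look mechanical.
                      pos = (i + phase + random.uniform(-0.2, 0.2)) / n
                      scheduled.append((pos % 1.0, t))
              scheduled.sort(key=lambda pair: pair[0])
              return [t for _, t in scheduled]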

      1. 2

        Spontaneous thought: Fisher–Yates picks the next element uniformly in the range [i,n] – why could you not pick using another distribution, defined by similarity to the previously locked-in song? With dynamic programming you should be able to get a reasonably efficient and optimal solution.
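        A greedy sketch of that idea (not the dynamic-programming-optimal version; `dissimilarity` is a hypothetical score, higher meaning the tracks sound less alike):

            import random

            def similarity_weighted_shuffle(tracks, dissimilarity):
                """Fisher-Yates, but the next pick is weighted by
                dissimilarity to the previously locked-in track."""
                tracks = list(tracks)
                for i in range(len(tracks) - 1):
                    if i == 0:
                        j = random.randrange(len(tracks))  # uniform first pick
                    else:
                        # Assumes at least one remaining weight is positive.
                        weights = [dissimilarity(tracks[i - 1], t)
                                   for t in tracks[i:]]
                        j = i + random.choices(range(len(tracks) - i),
                                               weights=weights)[0]
                    tracks[i], tracks[j] = tracks[j], tracks[i]
                return tracks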

      2. 2

        I think that’s pretty much state of the art. You can get better results in some cases by starting with a larger target array, say 4x the number of inputs; this reduces the number of collisions.

        I didn’t fully follow the last example, but I think it’s possible to do better than trying to pile the first track into index 0 every time.

    7. 0

      I came across Mesh on Hacker News, where someone said its shuffle vector looks like Fisher-Yates. I figured Fisher-Yates was worth its own submission, given the interest here in random numbers and collections.