1. 42

  2. 6

    This is really interesting, but I am left with all sorts of questions.


    Q1. Where did Sarah2 come from? Who and when, do they have any other work up (eg a website with further or related info)?

    I can’t find anything else on the web other than a reddit thread where it appears the author might be one of the users.

    The linked repo contains a readme for… the hosting site. Still no names anywhere as far as I can see in any of the files.

    Perhaps the author wants to be anon?

    Design decisions: arbitrary or intentional?

    Sarah2 only supports the letters a-z, and the underscore character.

    Sarah2 uses a 27x27 (729 entry) S-box.

    Q2: Is it safe to assume these decisions have been made to keep the S-box small, or are there cryptographic properties to them? ie does increasing the charset/sbox size reduce the effectiveness of the cipher?

    Safety decisions: arbitrary or intentional?

    My recommendation for how many rounds

    Q3: What was the thought process to come up with these numbers? At the moment it reads like they were plucked out of a hat.

    1. 4

      I can’t find anything else on the web other than a reddit thread where it appears the author might be one of the users.

      https://www.reddit.com/r/crypto/comments/ea00yb/sarah2_a_strong_penandpaper_cipher/ – username of submitter matches username of glitch author. Some good discussion in there.

      1. 3

        The linked repo contains a readme for… the hosting site. Still no names anywhere as far as I can see in any of the files.

        It’s hosted on glitch, which I’ve mostly seen used for small, web-embedded games. The “Code for this project” link is a bit of a red herring; the author is just using Glitch as a publishing platform.

        (So yes, they might be wanting to keep anon.)

        1. 1

          …sorry, I was thinking of itch.io

        2. 3

          The minimum number of rounds seems to be based on this statements in the diffusion section:

          The particular shuffle used ensures that even a one letter difference can propagate to the whole ciphertext in as few as log2 rounds.

          Now, it’s interesting that they claim that even if the adversary can perform chosen plaintext attacks, all you need to do is double the number of rounds. If I were such an attacker, I would request the encryption of all 2-character plaintexts; according to the article, those would only be taken through 2 rounds. I suspect that would always be sufficient to crack the key.

          (There’s something here about subsets that are transitively closed under encryption, and even vs. odd length sequences… but it’s too late at night for that, and I seem to have misplaced all memory of my abstract algebra class from university…)

        3. 4

          How do I decrypt a sarah2 encrypted string?

          1. 3

            Unshuffle and unpermute! Each round would look like this:

            • Split the string in two
            • Interleave the two pieces (starting with the first one)
            • Find each pair of characters in the huge table (ugh!) and replace it with its row/column coordinates

            Sounds like much more of a pain in the ass than encryption.

            1. 3

              Thank you so much for the explanation. And I agree: it does sound like a PITA. If the sender hasn’t messed anything up, using just pen and paper, chances are the recipient will.

              And another thing: If sender end recipient can exchange a 27×27 table securely, why would they go to the extreme of encrypting/decrypting a tiny string manually?

              As an academic ort educational excercise this scheme might be useful or interesting, but I have difficulties seeing this being used in ‘real life’.

              1. 2

                Presumably this would be used in a scenario where the two parties can communicate over an existing secure channel (including in person) and then later want to communicate when that channel is unavailable. But there’s still the constraint of having to hang onto a piece of paper with this table and keep it hidden/secure, which really reduces the set of use cases.

                Come to think of it, decryption doesn’t have to be worse than encryption; since the S-box is likely generated by computer, you could just print out a second, inverted S-box to be used for decryption. (And just because one party is doing this by hand doesn’t mean the other party is; the recipient might be using a computer and can even do error correction in case of typos.)

                I don’t foresee it being used in real life either, though.

          2. 3

            This hints at an interesting point: block ciphers cannot be proven secure in the standard model. We can just write them, hope, and see what kind of attacks show up.

            1. 3

              Commenting to create a forward link for anyone who comes across this: cryptanalysis, full plaintext recovery with relatively few known plaintext-ciphertext pairs: https://lobste.rs/s/zjcloa/cryptanalysis_sarah2_pen_paper_cipher

              1. 2

                Interesting, but a way to memorise they key would be good.. One of the advantages with solitaire, is that the key can be a pre-shuffled deck of cards. With memory palace style technique it should even be possible to memorize such a key.

                You could map the combinations to cards… But you’d need 14 decks of cards!

                Ed: i suppose one could find a way to generate s-boxes from some kind of seeded rng, and use (for example) a shuffled deck of cards as input.

                1. 2

                  What about taking a page from a book, and then taking all pairs in order, without repetition? Or remembering a poem, then applying the reduction, and using that – it should be way easier than to remember 729 pair of letters, in order :)

                  1. 2

                    I’m not sure how you mean to get an ehausive pairing from a book or a poem… How do you envision that would work?

                    The idea of encoding a random number as a rhyme of some kind is intriguing - just be aware it won’t be easy to encode a “good” (eg 96+ bit) key that way. 96 bits is a lot of data.

                    A sequence of a deck of cards encodes log2(52!) ~ 225 bits. It could be increased a bit by flipping face up/down, or if it’s possible to read the orientation (upside down, correct). Off-hand I’m not sure how much of an increase that is… I think you’d go from log2(52!) to log2(p((52^3),52)) which is AFAIK equal to each additional bit adding the same n number (ie, apart from rounding errors: 225, 550, 775 bits for sequence, sequence with flipped, sequence with flipped and rotated). But my combinatorics are rusty! (note 225bis rounded down (for entropy), for storing a sequence in ram, you’d have to round up to 226 bits).

                2. 1

                  We need more things like this to show how absurd it is that encryption is considered munitions and export controlled.

                  I hope some (more?) crypto experts take a look at this and see if they can crack and/or improve it.

                  1. 1

                    I don’t think this would be considered particularly strong cryptography—although it’s possible that the use of such a large secret key would be sufficient to protect the case of a small number of messages (or more, but longer messages.)

                  2. 1

                    I’m missing a detail on the construction of the S-box. I’m not at my computer or I’d try to confirm this, but are all of the digraphs unique (no repeats) or are they completely random?

                    1. 3

                      The S-box appears to be just a shuffling of all possible digraphs. The unique order of the outcome of that shuffle is what constitutes the symmetric key.