1. 37
  1.  

  2. 18

    I don’t like the design of Enchive.

    The process for encrypting a file:

    1. Generate an ephemeral 256-bit Curve25519 key pair.
    2. Perform a Curve25519 Diffie-Hellman key exchange with the master key to produce a shared secret.

    OK.

    3. SHA-256 hash the shared secret to generate a 64-bit IV.

    Kinda OK. The complexity can be justified by the need for a quick check before decryption (“validate the IV against the shared secret hash and format version”) that we have the correct key.

    4. Add the format number to the first byte of the IV.

    OK.
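    A minimal sketch of the IV steps and the quick pre-decryption check, using only the Python standard library. The names `derive_iv`, `iv_matches`, and `FORMAT_VERSION` are hypothetical, and reading “add” as addition modulo 256 on the first byte is an assumption, not something confirmed from Enchive’s source.

```python
import hashlib
import hmac

FORMAT_VERSION = 3  # hypothetical value, for illustration only


def derive_iv(shared_secret: bytes, format_version: int = FORMAT_VERSION) -> bytes:
    """Take the first 8 bytes of SHA-256(shared_secret) and mix in the format number."""
    digest = hashlib.sha256(shared_secret).digest()
    iv = bytearray(digest[:8])
    # Assumption: "add the format number" means addition modulo 256 on byte 0.
    iv[0] = (iv[0] + format_version) % 256
    return bytes(iv)


def iv_matches(shared_secret: bytes, stored_iv: bytes,
               format_version: int = FORMAT_VERSION) -> bool:
    """Quick check that the derived shared secret and format version fit the stored IV."""
    return hmac.compare_digest(derive_iv(shared_secret, format_version), stored_iv)
```

    A mismatch here only says that the key or format version is wrong; it is not a substitute for authenticating the file.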

    5. Initialize ChaCha20 with the shared secret as the key.

    This uses the raw scalar multiplication result as a key. It’s recommended to hash the result (but not with plain SHA-256, as we’re already exposing 56 bits of that hash as the IV) before using it as a cipher key (for example, NaCl uses HSalsa20 as a quick hash for that).

    6. Write the 8-byte IV.
    7. Write the 32-byte ephemeral public key.
    8. Encrypt the file with ChaCha20 and write the ciphertext.

    OK. But for big files, it may be worth using chunked authenticated encryption to avoid spilling out unauthenticated plaintext or wasting time (see https://www.imperialviolet.org/2014/06/27/streamingencryption.html and my implementation https://github.com/dchest/nacl-stream-js).
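    To illustrate the chunking idea, here is a rough sketch of the authentication framing only. It assumes the chunks are already encrypted (the standard library has no ChaCha20, so encryption is left out), and a real design would more likely use an AEAD per chunk as in the linked article. The function names and the 64 KiB default are hypothetical. Each tag binds the chunk index and a last-chunk flag, so reordered or truncated streams are rejected.

```python
import hashlib
import hmac
import struct


def _chunk_tag(mac_key: bytes, index: int, is_last: bool, chunk: bytes) -> bytes:
    # The header binds the chunk's position and whether it ends the stream.
    header = struct.pack("<QB", index, 1 if is_last else 0)
    return hmac.new(mac_key, header + chunk, hashlib.sha256).digest()


def seal_chunks(mac_key: bytes, data: bytes, chunk_size: int = 65536):
    """Split data into chunks and return a list of (chunk, tag) pairs."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    last = len(chunks) - 1
    return [(c, _chunk_tag(mac_key, i, i == last, c)) for i, c in enumerate(chunks)]


def open_chunks(mac_key: bytes, sealed):
    """Yield each chunk only after its tag verifies; raise ValueError otherwise."""
    last = len(sealed) - 1
    for index, (chunk, tag) in enumerate(sealed):
        expected = _chunk_tag(mac_key, index, index == last, chunk)
        if not hmac.compare_digest(tag, expected):
            raise ValueError(f"chunk {index} failed authentication")
        yield chunk
```

    Because a chunk is released only after its own tag checks out, a reader never handles more than one chunk of unverified data, and a stream cut short fails on the chunk that was not marked as last.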

    9. Write HMAC(key, plaintext).

    Here we have three problems.

    First, it uses the same key for HMAC as for encryption. I don’t think there’s a particular interaction problem between HMAC-SHA-256 and ChaCha20 that would lead to something scary, but this design is not ideal. To fix this and the previous issue in one shot, the author could use a hash function with 64-byte output to derive both encryption and authentication keys from the Curve25519 shared key: encr_key || mac_key = SHA512(shared_key), or use HMAC-SHA256 with different personalization strings (encr_key = HMAC-SHA256(“EncrKey”, shared_key) and mac_key = HMAC-SHA256(“AuthKey”, shared_key)), or HKDF.

    Secondly, it’s not encrypt-then-MAC: the HMAC is computed over the plaintext, which exposes the cipher to various attacks before there’s a chance of authenticating. Finally, I would also authenticate everything, not just the ciphertext. So I’d use HMAC(mac_key, everything), where everything is the IV, the ephemeral public key, and the ciphertext. This way, the HMAC is checked before decrypting, and a malicious payload is rejected early.
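    The suggested fixes could look roughly like this. It is a sketch, not Enchive’s format: the function names are hypothetical, the ciphertext is treated as opaque bytes because the standard library has no ChaCha20, and the SHA-512 split is just the first of the derivation options listed above. Field sizes (8-byte IV, 32-byte public key) come from the steps quoted earlier.

```python
import hashlib
import hmac

IV_LEN, PUB_LEN, TAG_LEN = 8, 32, 32


def derive_keys(shared_key: bytes):
    """encr_key || mac_key = SHA512(shared_key)."""
    digest = hashlib.sha512(shared_key).digest()
    return digest[:32], digest[32:]


def seal(mac_key: bytes, iv: bytes, ephemeral_public: bytes, ciphertext: bytes) -> bytes:
    """Return iv || ephemeral_public || ciphertext || tag; the tag covers all three."""
    if len(iv) != IV_LEN or len(ephemeral_public) != PUB_LEN:
        raise ValueError("unexpected field length")
    body = iv + ephemeral_public + ciphertext
    return body + hmac.new(mac_key, body, hashlib.sha256).digest()


def open_sealed(mac_key: bytes, message: bytes):
    """Check the tag first; hand back the fields only if it verifies."""
    if len(message) < IV_LEN + PUB_LEN + TAG_LEN:
        raise ValueError("message too short")
    body, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return body[:IV_LEN], body[IV_LEN:IV_LEN + PUB_LEN], body[IV_LEN + PUB_LEN:]
```

    The recipient would first read the ephemeral public key, compute the shared key, and call `derive_keys`; only after `open_sealed` succeeds is the ciphertext passed to the cipher.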

    Enchive uses an scrypt-like algorithm for key derivation, requiring a large buffer of random access memory.

    If it’s scrypt-like, why not just use scrypt? I haven’t checked the whole algorithm, but I can already see a drawback: it uses SHA-256 to perform work on memory. Scrypt specifically uses a very fast function (8-round Salsa20) so that it can perform this computation as quickly as possible, which is very important for a memory-hard function.
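    For comparison, using scrypt as the KDF primitive is a single call where a binding is available. This sketch uses Python’s `hashlib.scrypt`, which needs a Python build linked against OpenSSL 1.1 or later. The wrapper name and the parameters (N = 2**14, r = 8, p = 1, roughly 16 MiB of memory) are illustrative assumptions, not a tuning recommendation.

```python
import hashlib


def derive_key(passphrase: bytes, salt: bytes) -> bytes:
    """Derive a 32-byte key from a passphrase with scrypt."""
    # Memory use is about 128 * n * r bytes, so these values need roughly 16 MiB.
    return hashlib.scrypt(passphrase, salt=salt, n=2**14, r=8, p=1, dklen=32)
```

    The same passphrase and salt always give the same key, so a key pair derived from it can be regenerated later from the passphrase.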


    To summarize: there’s nothing particularly broken with this design, as far as I can tell from a quick look, but it’s not a solid design, unfortunately.

    1. 5

      Enchive’s author here. These are all good points. Most of the mistakes are me not knowing any better when I designed it, but, fortunately, none of them are fatal as far as I know.

      But for big files, it may be worth using chunked authenticated encryption to avoid spilling out unauthenticated plaintext

      I did eventually figure out chunked authentication for myself months later, but too late for Enchive. If I ever redesign the file format, it would definitely use chunked authentication, among other corrections like using EtM.

      If it’s scrypt-like, why not just use scrypt?

      At the time (early 2017) I couldn’t find a drop-in scrypt library with a friendly license, and I didn’t want to try implementing it myself. A major design goal was ANSI C and no dependencies. As a result, Enchive can easily be compiled just about anywhere, probably even decades into the future (to, say, decrypt some old archives). As evidence of this, you can build it and run it on Windows 98 decades in the past.

      1. 5

        I get the feeling most of those shortcomings are caused by direct use of primitives. I suspect that the author was trying to:

        1. minimize dependencies – especially looking at optparse.h, which is (mostly) redundant on a POSIX system due to getopt(3) existing – and source files, and
        2. keep the license unencumbered (all third-party code seems to be in the public domain), but then ended up making dangerous decisions given raw primitives.

        Argon2 not being in there is probably not an accident but a result of how difficult it is to implement and how he’d then have two hash functions (SHA-256, plus BLAKE2 for the Argon2 state).

        The author might’ve had a better result and less work with naive use of Monocypher, libsodium or TweetNaCl, though TweetNaCl still would’ve let him shoot himself in the foot with raw X25519.

        1. 1

          If it’s scrypt-like, why not just use scrypt?

          Yeah, it’s like they’re not aware that scrypt comes with a file encryption utility.

          1. 3

            I didn’t mean using the file encryption utility itself, but the KDF primitive. Although, indeed, the scrypt utility is great (I use it for my files), but it doesn’t do asymmetrical encryption, which seems to be the point of Enchive.

            1. 1

              but it doesn’t do asymmetrical encryption, which seems to be the point of Enchive.

              Ah, I missed that part. Hmm, well in that case Enchive seems pretty alright as far as goals are concerned. Hopefully the author will incorporate your suggestions.

        2. 4

          Yeah I think I’ll stick with GPG for now.

          1. 3

            I just recently ran across Sequoia-PGP, which doesn’t answer the original post’s concerns, but hopefully will become popular enough to encourage more thinking about the UX/concerns in the post.

            1. 3

              One nit: GPG 2.1+, I think, actually does start gpg-agent automatically on demand. But the main reason it does that is because for some unknown reason the GPG people decided to move most of the system’s functionality out of the gpg binary and into like 5 daemons that you now have to have running all the time and muddy the whole thing up. Why the old system was inadequate other than the fact that it was old and not shiny and overengineered is beyond me. (If someone knows, I would love to find out.)

              That all being said I am very happy to see this experiment. PGP is awful. I’d love to see it finally die.

              1. 10

                GPG contains lots of engineering effort to make sure that keys are not accidentally leaked to swap, and that applications using gpg under the hood have a safe method of doing so.

                I haven’t seen the GPG threat model fully documented but it’s definitely much more involved than Enchive’s.

                1. 5

                  They split the program into communicating parts to help isolate the address spaces of the executables.

                  See the section of Neal’s talk starting at 0:31:45 - https://begriffs.com/posts/2016-11-05-advanced-intro-gnupg.html

                  1. 1

                    I finally made time to look at this, thanks! AFAICT there are two reasons he gave (I watched until about 40 minutes left in the video; the dumb player UI won’t show me how far in that is):

                    1. Each component has a separate address space
                    2. Future window managers with “trusted windows” could treat pinentry specially, “somehow”, because it’s forked from gpg-agent

                    Honestly I don’t see why either of those things couldn’t just be accomplished with the exact same architecture except using regular subprocesses instead of daemons. Can anyone give a reason that isn’t the case?

                    Reason 2 in particular seems like a lot of engineering to support a vaguely defined future scenario which may or may not show up, ever, and certainly does not exist now.

                  2. 2

                    They broke a whole bunch of things when they moved to that new architecture and much of it was never fixed. It’s one of the reasons I stopped using GnuPG directly.

                  3. 1

                    Is there any well-known PGP alternative other than this? Based on history, I cannot blindly trust code that was written by one human being and is not battle-tested.

                    In any case, props to them for trying to start something. PGP does need to die.

                    1. 7

                      a while ago i found http://minilock.io/ which sounds interesting as a pgp alternative. i haven’t used it myself though.

                      1. 2

                        Its primitives and an executable model were also formally verified by Galois using their SAW tool. Quite interesting.

                      2. 6

                        This is mostly a remix, in that the primitives are copied from other software packages. It’s also designed to be run under very boring conditions: running locally on your laptop, encrypting files that you control, in a manual fashion (an attacker can’t submit 2^## plaintexts and observe the results), etc.

                        Not saying you shouldn’t ever be skeptical about new crypto code, but there is a big difference between this and hobbyist TLS server implementations.

                        1. 5

                          I’m Enchive’s author. You’ve very accurately captured the situation. I didn’t write any of the crypto primitives. Those parts are mature, popular implementations taken from elsewhere. Enchive is mostly about gluing those libraries together with a user interface.

                          I was (and, to some extent, still am) nervous about Enchive’s message construction. Unlike the primitives, it doesn’t come from an external source, and it was the first time I’ve ever designed something like that. It’s easy to screw up. Having learned a lot since then, if I was designing it today, I’d do it differently.

                          As you pointed out, Enchive only runs in the most boring circumstances. This allows for a large margin of error. I’ve intentionally oriented Enchive around this boring, offline archive encryption.

                          I’d love if someone smarter and more knowledgeable than me had written a similar tool — e.g. a cleanly implemented, asymmetric archive encryption tool with passphrase-generated keys. I’d just use that instead. But, since that doesn’t exist (as far as I know), I had to do it myself. Plus I’ve become very dissatisfied with the direction GnuPG has taken, and my confidence in it has dropped.

                          1. 2

                            I didn’t write any of the crypto primitives

                            that’s not 100% true, I think you invented the KDF.

                            1. 1

                              I did invent the KDF, but it’s nothing more than SHA256 applied over and over on random positions of a large buffer, not really a new primitive.

                        2. 6

                          Keybase? Kinda?…

                          1. 4

                            It always bothers me when I see the update say it needs over 80 megabytes for something doing crypto. Maybe no problems will show up that leak keys or cause a compromise. That’s a lot of binary, though. I wasn’t giving it my main keypair either. So, I still use GPG to encrypt/decrypt text or zip files I send over untrusted mediums. I use Keybase mostly for extra verification of other people and/or its chat feature.

                          2. 2

                            Something based on nacl/libsodium, in a similar vein to signify, would be pretty nice. asignify does apparently use asymmetric encryption via cryptobox, but I believe it is also written/maintained by one person currently.

                            1. 1

                              https://github.com/stealth/opmsg is a possible alternative.

                              Then there was Tedu’s reop experiment: https://www.tedunangst.com/flak/post/reop