1. 20

Does anyone here have experience using mutual TLS in place of API keys for authenticating clients to APIs?

If so I’d love to hear about benefits, pitfalls, software to support managing the client certs, and any other resources.

I assume this would work by the api owner issuing a certificate in exactly the same way as an api key, and distributing root cert public keys and revocations to the machine(s) that do tls termination?
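
A minimal sketch of the server side of that setup, using Python's standard `ssl` module: the machine doing TLS termination presents its own certificate and rejects any client that does not present one chaining to the API owner's CA. The function name and the file names in the usage note are hypothetical, and revocation checking (CRL or OCSP) is not covered here.

```python
import ssl
from typing import Optional


def make_mtls_server_context(
    server_cert: Optional[str] = None,
    server_key: Optional[str] = None,
    client_ca: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a server-side TLS context that requires a client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the client presents a
    # certificate that chains to a CA loaded below.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if server_cert is not None:
        # The server's own certificate and private key (paths are assumptions).
        ctx.load_cert_chain(certfile=server_cert, keyfile=server_key)
    if client_ca is not None:
        # The root (or intermediate) that issued the client certificates.
        ctx.load_verify_locations(cafile=client_ca)
    return ctx
```

With real files this would be called as `make_mtls_server_context("server.pem", "server.key", "client-ca.pem")`, and the returned context used with `ctx.wrap_socket(sock, server_side=True)`.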

  1. 10

    (note: for context, my experience with this is entirely service-to-service related. Your question is ambiguous about the specific use-case)

    Cert management, renewal, and revocation is a HARD problem. So I would suggest mTLS only in exceptionally simple use-cases or exceptionally complex cases. But it’s great for those use-cases.

    For very simple cases, you can just run your own CA using something like this guide and essentially manage the certs by hand. This will only really work for a dozen or so services before it becomes really onerous.

    On the other end of the spectrum, we use a service mesh for Kubernetes called Istio that performs this duty. It manages mTLS certs between services so they can talk to each other and we don’t have to manage any of that ourselves. This works by basically running a sidecar proxy in every pod that tunnels traffic and transparently manages the mTLS auth for you.

    If you were really ambitious, you could potentially achieve something similar with stunnel deployed with custom mTLS certs, but it would require some Ops magic.

    1. 1

      Thanks! Specifically I’m thinking of using this for my third party clients to authenticate to my service.

    2. 6

      My work does this at scale today!

      In general, that’s pretty much how it works. We have a long-lived root in a vault that signs multiple intermediate certs. The current intermediate is in an environment that has some automation to sign client certificates. I think we run an OCSP responder to handle revocations (not positive), but we terminate most of those connections at a single box so it’s not a huge deal either way.

      The biggest downside is teaching people on the other end how to use it. The tooling/dev experience around mTLS/client-certificates is not great. We offer both flavors – mTLS & API keys – and are finding API keys are the most preferred solution.

      1. 4

        If the use case requires client-auth (since you speak of mTLS), then API keys may not be the right approach to begin with. If the use case does not require client-auth, then mTLS may not be the right approach to begin with. :-)

        In general, mTLS requires PKI-like infrastructure, which is difficult to manage, operate and scale. Fortunes have been made trying to solve this problem. By comparison, an API-key infrastructure is simpler, so you should also factor this into your decision.

        1. 4

          Not sure if related, but the Gemini protocol makes use of TLS certificates to handle authentication.

          1. 3

            It’s authentication, Jim, but not as we know it. TLS certs are more a replacement for cookies in Gemini than anything else. That’s because each cert is self-signed, so there’s no real way to securely authenticate a certificate.

            Most clients allow a user to create a TLS keypair (they’re called identities in Lagrange) that can then be used as an identity for a specific gemsite, like for commenting or uploading.

            1. 1

              Thank you!

            2. 2

              At my company, we’re a service provider doing exactly this. We don’t have many customers (it’s a niche business), so what we do is ask customers to send us a client certificate, which we add to a list of trusted certificates.

              The idea is that our front server doing TLS termination will not trust any CA, only the client certificate that we decided to trust.

              This has several benefits:

              • customers can use their usual PKI to manage the certificate. They know what it is used for (Key Usages, Extended Key Usages, etc.) and they pick the fields; we just enforce minimum algorithm requirements for security (e.g. no RSA keys under 2048 bits, no SHA-1, …)
              • customers have the duty to renew their certificate: since it is part of their PKI, they have to manage it. So periodically we get new certificates to swap in.
              • we don’t have to manage any PKI ourselves: no revocation list, no HSM, etc. When we get audited we can skip the questions about Certificate Practice Statements, and we don’t have to explain our process for customer key renewal.
              • it’s super easy! We explain to customers that a batch job adds the certificate outside business hours to avoid any disruption, and that’s it.

              Some security companies pointed at it saying “that’s not secure” but they dropped it from the reports when we asked them to explain what was the threat faced. If someone knows about it… happy to learn!

              If you want to look at mTLS for service-to-service, I’d advise looking at step-ca, which is great for that.

              1. 2

                We had a pretty massively distributed network of clients speaking with a subset of APIs. We did a somewhat quick and dirty version of this because it was entirely in-network and Java removed anonymous ciphers by default. On startup, each API would create and sign its own certificate which would then be used for communication between client and API.

                This doesn’t solve the mutual problem, as clients themselves aren’t “trusted”, but it may give you some ideas for implementation if cost is a concern.

                I also did look a bit into the Noise Protocol as an alternative to TLS, but went with the above approach due to time constraints.

                1. 2

                  I work on SPIFFE, which does exactly this.

                  Honestly it is a lot of work to set up. But we have some huge users running it on millions of machines.

                  1. 1

                    I like client certs, but in my experience the physical smart cards (to store the x509 &c) are the weak point of failure. Which reminds me, one of my credit cards’ chips seems to be failing. Doing it wholly-software is also attractive, but I just haven’t done it.

                    I’m fine with having 2 cards (1 user + 1 admin), so having backups (or maybe just better hardware) would be great.

                    1. 1

                      Phones and other client devices can store the certs and private keys securely too. Not 100% sure about all platforms, but iOS and macOS do.

                    2. 1

                      I assume this would work by the api owner issuing a certificate in exactly the same way as an api key, and distributing root cert public keys and revocations to the machine(s) that do tls termination?

                      That’s one (and probably the most common) way to do it. It comes with the drawback of having your CA’s private key somehow exposed to the internet unless your use case allows doing it completely air-gapped (e.g. the number of certificates is rather low and/or users are prepared to tolerate some delay).

                      However, you can also turn it around: Instead of you signing and handing out client certificates, make your users generate self-signed certificates which they then hand to you via an already authenticated channel like an account admin UI. This has the advantage of freeing you from having to maintain a CA and all the operational headaches that entails. The drawback is that you need some way to dynamically provision certificates to your TLS server (if that happens often, you don’t want to be forced to restart it every time). Also, the server needs to be prepared to handle a large number of trusted certificates, since each certificate is signed with a different key. If that triggers a linear scan or similar on every connection attempt, that could turn into a DoS vector.
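
                      One way to avoid that linear scan, sketched under assumptions: index the uploaded certificates by the SHA-256 fingerprint of their DER encoding, so each connection costs a single dictionary lookup, and adding or removing a client does not need a restart. The class and method names are hypothetical, and how the TLS server hands the presented certificate's DER bytes to this code (after the handshake has proven possession of the private key) is left out.

                      ```python
                      import hashlib
                      from typing import Dict, Optional


                      class PinnedCertStore:
                          """Maps SHA-256 fingerprints of DER-encoded client certs to client IDs."""

                          def __init__(self) -> None:
                              self._by_fingerprint: Dict[str, str] = {}

                          @staticmethod
                          def fingerprint(cert_der: bytes) -> str:
                              # Hex SHA-256 digest over the whole DER-encoded certificate.
                              return hashlib.sha256(cert_der).hexdigest()

                          def register(self, client_id: str, cert_der: bytes) -> str:
                              # Called from the authenticated channel (e.g. an admin UI upload).
                              fp = self.fingerprint(cert_der)
                              self._by_fingerprint[fp] = client_id
                              return fp

                          def revoke(self, cert_der: bytes) -> bool:
                              # Returns True if the certificate was pinned and is now removed.
                              fp = self.fingerprint(cert_der)
                              return self._by_fingerprint.pop(fp, None) is not None

                          def lookup(self, cert_der: bytes) -> Optional[str]:
                              # Constant-time-on-average dict lookup instead of scanning all certs.
                              return self._by_fingerprint.get(self.fingerprint(cert_der))
                      ```

                      Revocation here is just deleting the entry, which is part of why this model avoids CRLs and OCSP entirely.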

                      Note that what you end up with is conceptually similar to how SSH client authentication works: Users generate their own key pair and via some authenticated channel put the fingerprint of their public key into an ~/.ssh/authorized_keys file.

                      1. 2

                        the fingerprint of their public key

                        Err, it’s the public key itself that one puts there, of course.

                      2. 1

                        This is almost exactly what tailscale does, except the TLS is opaque to the application.

                        Traffic arriving on the tailscale network interface is authenticated (the source IP address uniquely identifies the client and can’t be forged).

                        1. 1

                          I would recommend looking at the ACME protocol. This would enable automated certificate issuance on both sides. There’s some very nice stuff that does this from the step-ca folks. I made a very nice Raspberry Pi certificate server with a random number generator for well under $100. You can do this on both ends, have A certify B and B certify A, and it would all happen automatically.

                          1. 2

                            Unfortunately, I don’t believe ACME supports doing client certs. A server cert can be automated because the domain is the Common Name, but with clients that is almost certainly not the case.

                            This sort of thing would be done using an Identity Provider of some kind.

                            1. 1

                              There’s also Dogtag, often used with FreeIPA, which I think can be used with an API.