1. 38
  1. 22

    So, one of the companies that ruined the web with all those CAPTCHAs is going to save us from them now? Gee, I’m so excited! /sarcasm

    The aim of this project: we want to know that you’re human. But we’re not interested in which human you are.

    I am a WebAuthn ignoramus. How does this even work? The article talks about collecting a signature based on my private key. It must know my public key in order to verify it. That seems to make it so much easier for CloudFlare and online ad brokers to build a profile for me and track me across the web.

    1. 5

      The mechanism described uses registration. Each time, a new unique key pair is created, and the challenge and the new public key get signed with the attestation key, which is stored in the token and shared by a large number of tokens (at least 100,000 recommended). The server side then checks that the challenge was signed and that the attestation signature is correct and matches one of the approved keys. The way to track people would be to give out authentication requests instead of registration, but that requires passing KeyIDs (which are basically a carrier for storing the encrypted private keys on the server instead of on the token, to keep costs down). That would not scale at Cloudflare’s level, as it needs to know which KeyIDs to pass, and to know that it usually needs to know your identity (at least roughly), which serves no purpose if it already does. All in all, this seems a nice hack to use WebAuthn to filter out bots, even if almost as annoying as CAPTCHAs.
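
      A toy model of what the server gets to see under this scheme, assuming nothing beyond the description above. It contains no real cryptography: `secrets.token_hex` stands in for key generation and a `batch_id` string stands in for the shared attestation certificate. The names `Token`, `register`, `server_accepts`, and `APPROVED_BATCHES` are hypothetical.

      ```python
      import secrets
      from dataclasses import dataclass

      # Hypothetical allowlist of attestation batches the server trusts.
      APPROVED_BATCHES = {"vendor-a-batch-1"}


      @dataclass
      class Token:
          batch_id: str  # shared by every token in the same manufacturing batch

          def register(self, challenge: str) -> dict:
              # A fresh credential is minted per registration; random hex
              # stands in for a real public key.
              return {
                  "challenge": challenge,
                  "credential_public_key": secrets.token_hex(16),
                  "batch_id": self.batch_id,
              }


      def server_accepts(response: dict, expected_challenge: str) -> bool:
          # The server checks the challenge and the batch, nothing token-specific.
          return (
              response["challenge"] == expected_challenge
              and response["batch_id"] in APPROVED_BATCHES
          )


      token = Token("vendor-a-batch-1")
      first = token.register("c1")
      second = token.register("c2")
      print(server_accepts(first, "c1"))
      print(first["credential_public_key"] == second["credential_public_key"])
      ```

      In this model two registrations from the same token share only the batch identifier, which is the property the comment relies on.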

    2. 12

      Cloudflare trusts the root certificate of manufacturers. Because their numbers are limited, we have the capacity to verify them manually.

      That means that no free-software/free-hardware/bring-your-own-keys tokens will be trusted.

      1. 3

        Keys manufactured by Solokeys might get supported in the future, but likely only those manufactured by them and using their signed firmware.

      2. 5

        This doesn’t seem to motivate why having a security key proves that you’re human. What prevents people from just automating this interaction?

        1. 14

          “Prove you are human” has always been the sort of marketing spin on captchas; it’s about making automation marginally more expensive than it’s worth.

          1. 13

            In the case of Google, I think it is also: get image recognition training data for free. Most of their image captchas are clearly image recognition for automotive. I strongly suspect that a subset of tiles that they serve are unannotated and they will then use annotations for which there is high agreement.

            1. 2

              If you click wrong, someone could die. It isn’t just a captcha.

              1. 4

                First, people are incentivized to click the right tiles, since they want to bypass the captcha.

                Second, they would not base the label on a single annotation, but rather on thousands or even tens of thousands of annotations which have a high level of inter-annotator agreement.
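
                A minimal sketch of that kind of aggregation, assuming a simple majority vote with an agreement threshold. The function name and both thresholds are hypothetical; nothing here is known to be how Google actually labels its data.

                ```python
                from collections import Counter


                def aggregate_label(annotations, min_votes=1000, min_agreement=0.9):
                    """Return the majority label, or None if data or agreement is insufficient."""
                    if len(annotations) < min_votes:
                        return None
                    label, count = Counter(annotations).most_common(1)[0]
                    if count / len(annotations) >= min_agreement:
                        return label
                    return None


                votes = ["mailbox"] * 950 + ["parking meter"] * 50
                print(aggregate_label(votes))
                ```

                A few wrong clicks are outvoted, and a tile with split opinions gets no label at all.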

                1. 7

                  And yet, a lot of the time CAPTCHA insists that that mailbox is actually a parking meter.

            2. 3

              Prove we should allow us to advertise to you!

            3. 6

              The article addresses this toward the end:

              We also have to consider the possibility of facing automated button-pressing systems. A drinking bird able to press the capacitive touch sensor could pass the Cryptographic Attestation of Personhood. At best, the bird solving rate matches the time it takes for the hardware to generate an attestation. With our current set of trusted manufacturers, this would be slower than the solving rate of professional CAPTCHA-solving services, while allowing legitimate users to pass through with certainty.

              1. 3

                Essentially they are relying on the required physical presence mechanisms of FIDO security keys to limit the rate at which the challenges can be passed. So having a key does not prove you are human; it attempts to constrain anyone using it to roughly the challenge-passing rate attainable by a human. Since these keys are issued by trusted authorities, I imagine they have some mechanism implemented or planned that would ban keys with superhuman challenge rates.
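
                One way such a mechanism could look is a sliding-window limiter, sketched here under assumptions. The class name, the limits, and the `identifier` are all hypothetical: whatever the server can actually observe about a key (possibly only its attestation batch) would play that role.

                ```python
                from collections import defaultdict, deque


                class ChallengeRateLimiter:
                    def __init__(self, max_passes: int, window_seconds: float):
                        self.max_passes = max_passes
                        self.window_seconds = window_seconds
                        # identifier -> timestamps of recently allowed passes
                        self._passes = defaultdict(deque)

                    def allow(self, identifier: str, now: float) -> bool:
                        recent = self._passes[identifier]
                        # Drop timestamps that have fallen out of the window.
                        while recent and now - recent[0] >= self.window_seconds:
                            recent.popleft()
                        if len(recent) >= self.max_passes:
                            return False
                        recent.append(now)
                        return True


                limiter = ChallengeRateLimiter(max_passes=3, window_seconds=60)
                results = [limiter.allow("batch-x", t) for t in (0, 1, 2, 3, 61)]
                print(results)  # [True, True, True, False, True]
                ```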

              2. 2

                “Prove you are human” is begging the question “to whom?”

                This captcha timeline is the fastest route to an internet driving license and authoritarianism. It’s a really sensitive issue and I think that’s why it’s been left largely untouched.

                Obviously things are not sustainable as they are but this “solution” is just shuffling things around slightly, it doesn’t change the nature of the problem.

                Identity systems are complicated because we believe they must be p2p (democratic countries at least..) but clearly this is a byzantine setting…

                Using authenticated datastructures (also known as proof carrying data) is the best way to program such systems but we need to make this way of programming more accessible / extensible.

                I believe it all boils down to data interchange formats, json is inherently hard to canonicalize and it is too easy to have messages that are equivalent but not equal as data.

                We will need (canonical) domain-specific compilers to canonicalize data and deduplicate equivalent things. Guix and nix are already developing this timeline.

                Content addressable networks are the future. I am betting on canonical S-expressions (by working on my poorly explained project datalisp.is) but other approaches are also interesting (like the RON format - stands for replicated object notation iirc).
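
                The point about JSON messages that are equivalent but not equal can be shown in a few lines. This is only an illustration: the `canonical` helper is an ad hoc normalization (sorted keys, no insignificant whitespace), not a complete canonicalization scheme, and it ignores harder cases such as number formatting.

                ```python
                import hashlib
                import json

                a = '{"name": "alice", "id": 1}'
                b = '{"id":1,"name":"alice"}'


                def digest(text: str) -> str:
                    return hashlib.sha256(text.encode("utf-8")).hexdigest()


                def canonical(text: str) -> str:
                    # Ad hoc canonical form: sorted keys, compact separators.
                    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))


                print(json.loads(a) == json.loads(b))                # True: same data
                print(digest(a) == digest(b))                        # False: different bytes
                print(digest(canonical(a)) == digest(canonical(b)))  # True after normalizing
                ```

                A signature or content address computed over the raw bytes would treat `a` and `b` as different messages, which is why the encoding matters.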

                1. 1

                  I don’t get why you’re saying that data formats would somehow allow us to prove our identity? Is this a bot-generated comment?

                  1. 1

                    Lol, yeah the link is not obvious I guess. The point is that digital identity is a cryptographic concept and cryptography relies on canonical data.

                    In order to leverage cryptography properly we need to think about encoding.

                    W.r.t. the identity problem and Sybil resistance I am claiming that we have no solution as-is so we need to make the kind of programming that will lead to a solution more approachable. That way we can start exploring the problem space faster.

                2. 2

                  Tried this on Safari with TouchID (which normally works like a security key…) and it didn’t work :(

                  anyone else have any luck?

                  1. 3

                    It looks like they only support a couple of security key manufacturers, with yubikey being the biggest. I doubt TouchID provides the kind of manufacturer attestation needed for this scheme (but I could be wrong about that).

                    1. 1

                      Yeah, they say that they only support attestations by Yubikey, HyperFIDO and Thetis FIDO. TouchID probably provides attestation (though I’m not sure), but it just hasn’t been whitelisted by them yet.

                      1. 2

                        Apple does have an attestation scheme for TouchID, but it’s not the “standard” one. It’s anonymous and can’t be tracked, which probably isn’t desirable for Cloudflare’s use. Presumably they are misusing this feature so they can block “bad” users, which Apple’s feature doesn’t let them do.

                        Ctrl+F for “Apple Anonymous Attestation” on https://webkit.org/blog/11312/meet-face-id-and-touch-id-for-the-web/

                        1. 1

                          You can’t “block bad users” as things stand right now. Each attestation key is used in at least 100,000 tokens, so there’s no reasonable way to block a single one of them with the way it’s done. Apple’s way, meanwhile, is quite a bit more complicated: it requires a connection to Apple’s servers from your machine, and it creates a new attestation certificate each time, signed by Apple’s “master” certificate on their servers (and it seems like it’s opt-in?). I’m not entirely sure there’s much difference on the privacy front, besides Apple not having to worry about somebody extracting attestation keys from their machines and spoofing their attestation.

                          1. 2

                            I think 1 in 100k, combined with additional signals like client fingerprinting, IP, etc, is absolutely enough to identify and block a bot. Even in the worst case where you block whole batches of yubikeys, the attacker cost goes up as they buy more keys, but legitimate users just fall back to captchas.

                            1. 1

                              The whole point of this for them was to decrease their CAPTCHA usage. Turning users back to CAPTCHAs is counterproductive for them. 1 in 100k is a tiny fraction, and with care, a bot writer can easily blend into a group that size.

                              1. 1

                                Most of that 100k set of users will not be visiting any particular website at a time.

                                If the point of this isn’t to block bad bots, then what is it? Bot writers will have a yubikey-as-a-service API from somebody soon, probably using a rotating set of some dozens of security keys. So it’ll be even easier for bots than captchas are today, if Cloudflare isn’t using the key batch as a signal to block.

                  2. 2

                    What a horrible idea.

                    Based on our data, it takes a user on average 32 seconds to complete a CAPTCHA challenge.

                    I bet I’m faster, unless Google decides to give me multiple.

                    The user plugs the device into their computer or taps it to their phone for wireless signature (using NFC).

                    And this is faster than 32 seconds? I actually have a yubikey attached to the computer I’m working on, but it’s kinda out of reach because I don’t need to hit it very often. Also, what about browsers that won’t work with this? (For a long time that included even Firefox, so it was just Chrome.) And what about VMs or remote machines where you don’t have a hardware key? On my main machine at home I only plug the (other) key in when I log in to one of the 2FA services that support it. Like… 3-5 of them. It’s usually sitting on my keyring in another room…

                    Maybe I’m again the stereotypical “I have all the exceptions here” person but I don’t see how this would make anything easier for me.

                    1. 1

                      I bet I’m faster, unless Google decides to give me multiple.

                      Yeah, because if you’re on linux, and not logged in to a google account, you’re getting multiple passes of the slow-fading captcha. And only if you’re lucky, you’ll be able to pass it - most of the time, it’s better to just give up and close the tab.

                    2. 1

                      Yeah. When billions of people have tried for years to help Google’s AI tell the difference between crosswalks and traffic lights, and it obviously doesn’t catch on, it’s time to call it a day.

                      1. 1

                        But… why are security keys hardware instead of software? Is it possible to reverse engineer these hardware keys and reimplement them with software?

                        1. 6

                          There’s no need for reverse engineering, the specs are public. GitHub maintains a software U2F emulator, which is a software emulation of the USB protocol using the userspace device driver infrastructure. It stores the keys in the macOS keychain (which, on recent Macs, is actually protected by a separate security chip). On Windows, WebAuthn via Windows Hello is backed by the TPM, so it’s a separate chip that stores the keys.

                          The extra thing on top of this seems to be that the manufacturers sign the keys in the devices with a claim that they do some rate limiting. That doesn’t really seem to take the economics into account though. The most common attack on CAPTCHAs is to pay folks on Mechanical Turk to solve them. A U2F device costs about $20. If it can let you in every 10 seconds, then it’s three times faster than a human doing the same thing (32 seconds to solve a CAPTCHA on average, according to the article) and that’s $20 for the entire lifetime of the device. A bank of them in a datacenter could easily undercut human CAPTCHA solvers.
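
                          The arithmetic behind that comparison, as a rough sketch. The $20 price, the 10-second pass time, and the 32-second CAPTCHA average come from the comment; round-the-clock operation and the 30-day service period are assumptions for illustration, and power, hosting, and labor costs are ignored.

                          ```python
                          SECONDS_PER_DAY = 24 * 60 * 60

                          device_cost_usd = 20.0          # from the comment
                          device_seconds_per_pass = 10    # from the comment
                          human_seconds_per_captcha = 32  # article's average, as quoted

                          # Assumes both run around the clock, which flatters the human side.
                          device_passes_per_day = SECONDS_PER_DAY // device_seconds_per_pass
                          human_solves_per_day = SECONDS_PER_DAY // human_seconds_per_captcha


                          def device_cost_per_pass(days_in_service: int) -> float:
                              # One-off hardware cost spread over every pass in the period.
                              return device_cost_usd / (device_passes_per_day * days_in_service)


                          print(device_passes_per_day)   # 8640
                          print(human_solves_per_day)    # 2700
                          print(device_passes_per_day / human_solves_per_day)  # 3.2
                          print(device_cost_per_pass(30))
                          ```

                          Under these assumptions one key outpaces one solver by a factor of about three, and the per-pass hardware cost keeps shrinking the longer the key stays in service.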

                          1. 1

                            The problem is that it is a lot harder to forward FIDO2 tokens than CAPTCHAs. For CAPTCHAs you need a simple VNC. For security keys, you would probably need to forward USB devices. Possible, but a lot harder to automate. Or you run your bots on the Mechanical Turk workers’ computers, which probably wouldn’t go well.

                            Oh, and the keys probably don’t purposefully rate limit, it’s just that their hardware is slow enough that it takes a second to generate and sign the attestation.

                            1. 3

                              For security keys, you would probably need to forward USB devices. Possible, but a lot harder to automate

                              I don’t think it is. You need to forward the challenge-response pairs, but they’re just values in an HTTP request. We have a lot of infrastructure for forwarding HTTP requests between computers…

                              1. 1

                                And rewrite a good chunk of the browser FIDO stack if you want to go that way. Catching the WebAuthn calls would probably be the entry point, from which you send the parameters over to a program on the computer with the token, which actually performs the requests on the key. Still, that is far from the ready-made solution that VNC is.

                                1. 5

                                  I am trying to understand the system you’re imagining. I’m assuming someone trying to attack a system and needing to bypass CAPTCHAs. The way that they do this today is to have a custom client with a modified WebKit / Blink that pulls out the CAPTCHA queries, presents them via an HTTP interface to folks on MTurk, and then forwards the responses. The way that they’d do it with the WebAuthn model would be to have a load of U2F devices attached to a single machine and send the WebAuthn requests to each one. The infrastructure is much simpler with the WebAuthn / U2F model.

                                  1. 1

                                    The browser FIDO stack is a new feature. Initially this was offered (and is still offered) as a set of libraries.

                                    Websites interact with a rather simple javascript API to achieve this whole thing and it’s really not hard to just scrape the calls out of a response and put that data into the C APIs.

                            2. 1

                              Is it possible to reverse engineer these hardware keys and reimplement them with software?

                              It’s definitely possible to have software versions - back in 2010 or thereabouts I was contracting for a company who used RSA fobs to control remote access and I had a software version rather than an actual hardware fob.

                              1. 1

                                Not sure about this case, but hardware keys such as HASP, Sentinel (in 90s, 2000s) were a pretty popular target for crackers to create a software-emulated key that would allow running the application without the physical dongle attached.

                                I’m not sure how similar this solution is, but I wouldn’t bet any money on it being secure by default.

                              2. 1

                                I didn’t really see anything that would prevent this solution from being automated. Maybe the actual protection is in the implementation details, but this article didn’t seem to present it.

                                I’m also skeptical because it’s Cloudflare. “Use our solution instead of competitor’s” – this is what this article means to me.

                                Also, I understand that captchas are a PITA for the users, but I don’t really see a different solution for the bot problem from the administrative side. Having broken some captchas in the past I know that very often it’s pretty easy to automate something on the web, even if some “protections” like e-mail verification take place. Captcha is sometimes the only “real” solution that doesn’t involve methods invading user privacy (like giving out your mobile phone number). From this point of view I think that captchas are actually solving a real problem.

                                I have run some public websites for years, and bots are always a threat. Without any protection, it’s just a matter of time before the website gets spammed with links. Generic protection is often enough to defend against mass-discovery-and-spam bots, but it is worthless against directed attacks. It’s also true that it’s possible to hire a captcha-solving company to break through captchas, but the cost of operation increases significantly in that case.

                                1. 1

                                  CaptchaBuster to save us all!