  1. 16

    You can configure your web browser not to send the User-Agent HTTP header at all – it is not mandatory.

    Removing my user agent also removed my ability to visit Cloudflare-encumbered websites, so I’d say it IS mandatory on large parts of the internet.

    1. 10

      It also causes lobste.rs to return a 500 error, and Netflix to throw you to a help article about outdated browsers without even letting you try the player.

    2. 5

      I think the only way for something like this to get traction is to use a different header. Stuffing it in the existing User-Agent header instead of the current mess is impractical until you get a critical mass of servers to start replacing their old, pathological uses of User-Agent with uses of the URI. And many of the servers that are leaning on the old header are just the ones that won’t be making many changes.

      Using a new header has potential, though. If you could get all the browsers to start emitting it, once it became common you could start turning off the old one and filing bugs with sites that break. It’d still take ages before most everyday browsing could leave the old UA header turned off. But you might get there that way.

      1. 2

        Changing the header name is the easiest option, and I am not saying no… But maybe it would end up like dual-stack IPv4 + IPv6, where the transition to “IPv6-only” is quite rare.

        1. 2

          I came here to suggest this, too. There’s just too much legacy code out there that does UA detection, esp. ancient corporate and government websites. user-agent-uri or agent-uri or uri-user-agent would make it far more actionable.

          Maybe there’s an opportunity for some server action, too. Perhaps a server could respond to an initial request, or serve a .well-known URL, with something like a user-agent-header field containing user-agent (or b/v) for the old style, or user-agent-uri (or uri) for the new, URI-based way.
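          The discovery idea above might be sketched like this. Note that the /.well-known path, the user-agent-header field name, and its values are all hypothetical here – they come from this comment, not from any existing specification.

```python
import json

def preferred_ua_style(well_known_body: str) -> str:
    """Return "uri" if the server advertises the new header, else "legacy".

    The "user-agent-header" field and its values are made up for
    illustration; no such .well-known resource is standardized.
    """
    doc = json.loads(well_known_body)
    return "uri" if doc.get("user-agent-header") == "user-agent-uri" else "legacy"

# A server opting in to the new, URI-based header might publish:
print(preferred_ua_style('{"user-agent-header": "user-agent-uri"}'))  # uri
# while a legacy server might publish:
print(preferred_ua_style('{"user-agent-header": "user-agent"}'))      # legacy
```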

          It’d still take ages before most everyday browsing could leave the old UA header turned off.

          I’ll call it 15 years.

          1. 2

            It’d still take ages before most everyday browsing could leave the old UA header turned off.

            I’ll call it 15 years.

            Yeah. That might even be conservative. The first code that I personally wrote with IPv6 support, because everyone would definitely soon be using IPv6, is 24 years old now.

            That’s a big part of what makes me say a new header would help in this case, though. If we have to re-use the old header, ossification will keep most everyone from doing it. @franta: If we can use a new header, and this new header provides (as detailed in the proposal) better information than the old one did, you will get people using the new one if the browsers ship it. Because it’s more useful for them.

            And that gives you a chance to clear the low bar set by IPv6.

            1. 2

              The first code that I personally wrote with IPv6 support, because everyone would definitely soon be using IPv6, is 24 years old now.

              This is so depressing; what a frustrating failure IPv6 promotion has been. There seems to have been a big jump recently: this article from March 2022 shows Google says 33.96%, while the current Google IPv6 adoption stat as of May 1, 2022 says 39.46%. It looks like there’s a regular fluctuation of 4–5% of the total, though, and we’re at a peak that is also an all-time high.

            2. 1

              What about this fallback logic?

              1. Check the User-Agent-URI header; if present, use its value and ignore the User-Agent header.
              2. Check the User-Agent header; if it is a valid URI, use it according to the User-Agent URI specification.
              3. Otherwise, read the User-Agent header the old way (try to parse the mess).

              So brave browsers/users could put the URI in the User-Agent header immediately, while conservative browsers/users could put the URI in the new User-Agent-URI header and keep the original mess in User-Agent, as everyone is used to.
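              The fallback logic above could be sketched like this on the server side, using the stdlib URL parser. The User-Agent-URI header name and the user-agent:// scheme follow the proposal discussed in this thread; the helper name is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

def resolve_user_agent(headers: dict) -> tuple[str, object]:
    """Return ("uri", parsed_fields) for the new style, ("legacy", raw) otherwise."""
    # Step 1: prefer the new User-Agent-URI header over User-Agent.
    candidate = headers.get("User-Agent-URI") or headers.get("User-Agent", "")
    # Step 2: if the value is a valid user-agent URI, parse it as one.
    parts = urlsplit(candidate)
    if parts.scheme == "user-agent":
        return "uri", parse_qs(parts.query)
    # Step 3: fall back to the legacy string (parse the mess elsewhere).
    return "legacy", candidate

# A brave browser putting the URI directly into User-Agent:
kind, fields = resolve_user_agent({"User-Agent": "user-agent://ff/?bv=100&e=gecko"})
# kind == "uri", fields == {"bv": ["100"], "e": ["gecko"]}
```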

              1. 1

                I missed this reply before… I think this is a reasonable way to set up a transition that could work.

          2. 4

            User agents are a mess, I agree. It even got to the point where three-digit browser version numbers were said to break UA parsing in some libraries. This looks like a decent first step towards standardizing UA strings.

            1. 4

              The URI is a great idea. I suggest using an HTTPS URL that gives you an expanded view into what the device is.

              I have a side project where I’m doing some HTML scraping. I anticipate that some sysadmins might get pissed at the load generated by the tool, so I’m setting the User-Agent to a URL that explains what it is and some actions a sysadmin can take if they don’t appreciate the traffic coming from my tool.
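              As a rough sketch of that approach, here is how a scraper might set its User-Agent to point at an explanatory page; the URL is a made-up placeholder, and the "(+URL)" convention is borrowed from existing crawlers.

```python
import urllib.request

# Hypothetical page explaining what the tool is and how to opt out.
EXPLAINER_URL = "https://scraper.example.com/about"

req = urllib.request.Request(
    "https://example.com/page.html",
    headers={"User-Agent": f"my-scraper/1.0 (+{EXPLAINER_URL})"},
)
# html = urllib.request.urlopen(req).read()  # actual network call, omitted here
```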

              1. 2

                I’d like to see an example of what that would look like. Is a URI the best way to serialize a messy structure? It might be, but there are also better ways to represent variations than a flattened list of [scheme, authority, path, query, fragment]. Of course you can encode (almost) anything in a URI, but it could look terrible as well.

                I guess as soon as we have a standard way to parse it, we’ve won, whatever the format.

                1. 3

                  It should generally look like this:

                  user-agent://cool-browser.example.com/?bv=100&e=gecko&d=d&x=1920&y=1080
                  

                  or this in case of well-known browsers:

                  user-agent://ff/?bv=100&e=gecko&d=d&x=1920&y=1080
                  

                  The URL/URI format was chosen because these parsers are already used in the web ecosystem (the idea is not to introduce a new format or require a new parser dependency). Another reason is that a URI serves as a globally unique identifier and can be referred to e.g. from RDF (however, this is just a side-effect).
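                  For instance, the second example above can already be taken apart with the stdlib URL parser, with no new dependency; the field names (bv, e, d, x, y) are the ones from the examples in this thread.

```python
from urllib.parse import urlsplit, parse_qs

uri = "user-agent://ff/?bv=100&e=gecko&d=d&x=1920&y=1080"
parts = urlsplit(uri)
fields = parse_qs(parts.query)

print(parts.netloc)    # the browser identifier: ff
print(fields["bv"])    # the browser version: ['100']
print(fields["x"])     # the screen width: ['1920']
```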