1. 6

    I find it curious that the Blink team at Google takes this action in order to prevent various other teams at Google from doing harmful user-agent sniffing to block browsers they don’t like. Google certainly isn’t the only one, but they’re among the biggest user-agent sniffing abusers.

    FWIW, I think it’s a good step, nobody needs to know I’m on Ubuntu Linux using X11 on an x86_64 CPU running Firefox 74 with Gecko 20100101. At most, the Firefox/74 part is relevant, but even that has limited value.

    1. 14

      They still want to know that. The mail contains a link to the proposed “user agent client hints” RFC, which splits the user agent into multiple more standardized headers the server has to request, making “user-agent sniffing” more effective.

      1. 4

        Oh. That’s sad. I read through a bit of the RFC now, and yeah, I don’t see why corporations wouldn’t just ask for everything and have slightly more reliable fingerprinting while still blocking browsers they don’t like. I don’t see how the proposed replacement isn’t also “an abundant source of compatibility issues … resulting in browsers lying about themselves … and sites (including Google properties) being broken in some browsers for no good reason”.

        What possible use case could a website have for knowing whether I’m on ARM or Risc-V or x86 or x86_64 other than fingerprinting? How is it responsible to let the server ask for the exact model of device you’re using?

        The spec even contains wording like “To set the Sec-CH-Platform header for a request, given a request (r), user agents MUST: […] Let value be a Structured Header object whose value is the user agent’s platform brand and version”, so there’s not even any space for a browser to offer an anti-fingerprinting setting and still claim to be compliant.
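A sketch of the opt-in flow the draft describes may make the objection concrete: the server has to request each hint via `Accept-CH`, and the browser sends the requested values on later requests. The header names below are taken from the draft quoted above and may differ in newer revisions; the values and the per-hint `policy` knob are made up for illustration (the policy being exactly the anti-fingerprinting setting the MUST-language arguably leaves no room for):

```python
# Hints a server might request, per the draft quoted above (hypothetical list).
REQUESTED_HINTS = ["Sec-CH-Platform", "Sec-CH-Arch", "Sec-CH-Model"]

def server_response_headers():
    """Headers a server sends to opt in to receiving client hints."""
    return {"Accept-CH": ", ".join(REQUESTED_HINTS)}

def browser_request_headers(requested, policy):
    """Headers a browser could send back, honoring a per-hint user policy
    that can decline individual hints (not something the draft permits)."""
    available = {
        "Sec-CH-Platform": '"Linux"',
        "Sec-CH-Arch": '"x86-64"',
        "Sec-CH-Model": '""',
    }
    return {
        hint: available[hint]
        for hint in requested
        if hint in available and policy.get(hint, True)
    }
```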

        1. 4

          What possible use case could a website have for knowing whether I’m on ARM or Risc-V or x86 or x86_64 other than fingerprinting?

          Software download links.

          How is it responsible to let the server ask for the exact model of device you’re using?

          … Okay, I’ve got nothing. At least the W3C has the presence of mind to ask the same question. This is literally “Issue 1” in the spec.

          1. 3

            Okay, I’ve got nothing.

            I have a use case for it. I have a server which users run on an intranet (typically either just an access point, or a mobile phone hotspot), with web browsers running on random personal tablets/mobile devices. Given that the users are generally not technical, they’d probably be able to identify a connected device as “iPad” versus “Samsung S10” if I can show that in the web app (or at least ask around to figure out whose device it is), but will not be able to do much with e.g. an IP address.

            Obviously pretty niche. I have more secure solutions planned for this; however, I’d like to keep the low barrier to entry that knowing the hardware type from the user agent provides, in addition to those.

          2. 2

            What possible use case could a website have for knowing whether I’m on ARM or Risc-V or x86 or x86_64 other than fingerprinting?

            Benchmarking and profiling. If your site performance starts tanking on one kind of processor on phones in the Philippines, you probably want to know that to see what you can do about it.

            Additionally, you can build a website with a certain performance budget when you know what your market minimally has. See the Steam Hardware and Software Survey for an example of this in the desktop videogame world.

            Finally, if you generally know what kinds of devices your customers are using, you can buy a bunch of those for your QA lab to make sure users are getting good real-world performance.

        2. 7

          Gecko 20100101

          Amusingly, this date is a static string — it is already frozen for compatibility reasons.

          1. 2

            Any site that offers you/administrators a “login history” view benefits from somewhat accurate information. Knowing the CPU type or window system probably doesn’t help much, but knowing it’s Firefox on Ubuntu combined with a location lookup from your IP is certainly a reasonable description to identify if it’s you or someone else using the account.

            1. 2

              There are times I’d certainly like sites to know I’m using a minority browser or a minority platform, though. Yes, there are downsides because of the risk of fingerprinting, but it’s good to remind sites that people like me exist.

              1. 1

                Though the audience here will play the world’s tiniest violin for those affected, the technical impact aspect may be of interest.

                The version numbering is a useful low-hanging-fruit method in the ad-tech industry to catch fraud. A lot of bad actors use either just old browsers[1] or skew browser usage ratios; though of course most ‘fraud’ detection methods are naive and just assume anything older than two major releases is fraud, ignoring details such as LTS releases.

                [1] persuade the user to install a ‘useful’ tool and it sits as a background task burning ads, or as a replacement for the user’s regular browser (never updated)

              1. 1

                Samsung M2022W Black&White laser. Never really got the wifi all figured out, so connected to a Ubuntu machine via USB.

                1. 15

                  Maybe some folk don’t understand what’s going on here, but this is in direct violation of Postel’s law:

                  They’re blocking access from old devices for absolutely no technical reason; they’re blocking read-only access from folks that might not have any other devices at their disposal.

                  If you have an old iPod lying around, why on earth should you not be able to read Wikipedia on it? Absolutely no valid technical reason to deny access. Zilch. None. Nada.

                  There’s no reason it shouldn’t be possible to read Wikipedia over straight HTTP, for that matter.

                  1. 9

                    I know next to nothing about security so correct me if I’m wrong, but doesn’t leaving old protocols enabled make users vulnerable to downgrade attacks?

                    1. 14

                      You’re applying bank-level security to something that’s public information and should be accessible to everyone without a licence or access control in the first place. I don’t even know what comparison to make here, because in my view requiring HTTPS here was a misguided decision in the first place, based on politics, corporate interests and fear, not on rational facts. Postel’s law is also a well-known course of action in telecommunication, and even Google still follows it — www.google.com still works just fine over straight HTTP, as does Bing; no TLS mandated for those who don’t want it.

                      1. 5

                        I agree with you, I’d like to be able to access Wikipedia with HTTP, but this is in my opinion a different issue from disabling old encryption protocols.

                        Accessing Wikipedia with secure and up to date protocols might not be necessary to you but it might be for people who live under totalitarian regimes. One could argue that said regimes have better ways to snoop on their victims (DNS tracking, replacing all certificates with one they own…) but I still believe that if enforcing the use of recent TLS versions can save even a single life, this is a measure worth taking. It would be interesting to know if Wikipedia has data on how much it is used by people living in dictatorships and how much dropping old TLS versions would help these people.

                        1. 4

                          totalitarian regimes

                          It’s funny you mention it, because this actually would not be a problem under a totalitarian regime with a masquerading proxy and a block return policy for the https port and/or their own certificates and a certificate authority. See https://www.xkcd.com/538/.

                          Also, are you suggesting that Wikipedia is basically blocking my access for my own good, even though it’s highly disruptive to me, and goes against my own self-interests? Yet they tell me it is in my own interest that my access is blocked? Isn’t that exactly what a totalitarian regime would do? Do you not find any sort of an irony in this situation?

                          1. 3

                            “Isn’t that exactly what a totalitarian regime would do?”

                            I think you may have overstated your case here.

                            1. 2

                              this actually would not be a problem under a totalitarian regime with a masquerading proxy and a block return policy for the https port and/or their own certificates and a certificate authority.

                              Yes, this is what I meant when I wrote “One could argue that said regimes have better ways to snoop on their victims”.

                              Also, are you suggesting that Wikipedia is basically blocking my access for my own good

                              No, here’s what I’m suggesting: there are Wikipedia users who live in countries where they could be thrown in jail/executed because of pages they read on Wikipedia. These users are not necessarily technical, do not know what a downgrade attack is and this could cost them their lives. Wikipedia admins feel they have a moral obligation to do everything they can to protect their lives, including preventing them from accessing Wikipedia if necessary. This is a price they are willing to pay even if it means making Wikipedia less convenient/impossible to use for other users.

                        2. 1

                          If they left http, yeah, sure. But I don’t think any attack that downgrades the SSL encryption method exists; both parties always connect using the best they have. If there is one, please let me know.

                          There is no technical reason I’m aware of. Why does wikipedia do this? It’s not like I need strong encryption to begin with, I just want to read something on the internet.

                          I still have a usable, working smartphone with Android Gingerbread; it’s the first smartphone I ever used. It’s still working flawlessly and I use it sometimes when I want to quickly find something, my current phone has no battery, and I don’t want to turn on my computer.

                          This move will for no reason kill my perfectly working smartphone.

                          1. 9

                            But I don’t think any attack that downgrades ssl encryption method exists,

                            Downgrade attacks are possible with older versions of SSL e.g. https://www.ssl.com/article/deprecating-early-tls/
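One concrete defense against such downgrades, sketched here client-side in Python (`SSLContext.minimum_version` exists since Python 3.7), mirrors what Wikipedia is doing server-side: refuse the legacy protocol versions outright, so no fallback can be negotiated regardless of what the peer offers:

```python
import ssl

# Build a default client context (verifies certificates by default) and
# raise the floor so SSLv3, TLS 1.0 and TLS 1.1 can never be negotiated,
# even if a man-in-the-middle tampers with the handshake.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_2
```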

                            It’s not like I need strong encryption to begin with, I just want to read something on the internet.

                            Which exact page you’re looking at may be of interest, e.g. if you’re reading up on medical stuff.

                            1. 1

                              Which exact page you’re looking at may be of interest, e.g. if you’re reading up on medical stuff.

                              Are you suggesting that we implement access control in public libraries, so that no one can browse or check out any books without strict supervision, approvals and logging by some central authority? (Kinda like 1984?)

                              Actually, are you suggesting that people do medical research and trust information from Wikipedia, literally edited by anonymous people on the internet?! HowDareYou.gif. Arguably, this is the most misguided security initiative in existence if thought of in this way; per my records, my original accounts on Wikipedia were created before they even had support for any TLS at all; which is not to say it’s not needed at all, just that it shouldn’t be a mandatory requirement, especially for read-only access.

                              P.S. BTW, Jimmy_Wales just responded to my concerns — https://twitter.com/jimmy_wales/status/1211961181260394496.

                              1. 10

                                Are you suggesting that we implement access control in public libraries, so that no one can browse or check out any books without strict supervision, approvals and logging by some central authority? (Kinda like 1984?)

                                I’m saying that you may not wish other people to infer what medical conditions you may have based on your Wikipedia usage. So TLS as the default is desirable here, but whether it should be mandatory is another question.

                                1. 2

                                  Are you suggesting that we implement access control in public libraries, so that no one can browse or check out any books without strict supervision, approvals and logging by some central authority? (Kinda like 1984?)

                                  PSST, public libraries in the western world already do this to some extent. Some countries are more central than others thanks to the US PATRIOT Act.

                                  1. 1

                                    public libraries in the western world

                                    Not my experience at all; some private-university-run libraries do require ID for entry; but most city-, county- and state-run libraries still allow free entry without having to identify yourself in any way. This sometimes even extends to making study-room reservations (can often be made under any name) and anonymous computer use, too.

                              2. 8

                                I still have a usable, working smartphone with Android Gingerbread; it’s the first smartphone I ever used. It’s still working flawlessly and I use it sometimes when I want to quickly find something, my current phone has no battery, and I don’t want to turn on my computer.

                                This move will for no reason kill my perfectly working smartphone.

                                It’s not working flawlessly, the old crypto protocols and algorithms it uses have been recalled like a Takata airbag, and you’re holding on because it hasn’t blown up in your face yet.

                                1. 2

                                  This move will for no reason kill my perfectly working smartphone.

                                  (my emphasis)

                                  So you just use this phone to access Wikipedia, and use it for nothing else?

                                  If so, that’s unfortunate, but your ire should be directed to the smartphone OS vendor for not providing needed updates to encryption protocols.

                                  1. 2

                                    your ire should be directed to the smartphone OS vendor for not providing needed updates to encryption protocols

                                    I think it’s pretty clear that the user does not need encryption in this use-case, so, I don’t see any reason to complain to the OS vendor about encryption when you don’t want to be using any encryption in the first place. Like, seriously, what sort of arguments are these? Maybe it’s time to let go of the politics in tech, and provide technical solutions to technical problems?

                                    1. 1

                                      As per my comment, I do believe that the authentication provisions of TLS are applicable to Wikipedia.

                                      Besides, the absolute outrage if WP had not offered HTTPS would be way bigger than now.

                              3. 15

                                I find the connection to Postel’s law only weak here, but in any case: This is the worst argument you could make.

                                It’s pretty much consensus among security professionals these days that Postel’s law is a really bad idea: https://tools.ietf.org/html/draft-iab-protocol-maintenance-04

                                1. 3

                                  I don’t think what passes for “postel’s law” is what Postel meant, anyway.

                                  AFAICT, Postel wasn’t thinking about violations at all, he was thinking about border conditions etc. He was the RFC editor, he didn’t want anyone to ignore the RFCs, he wanted them to be simple and easy to read. So he wrote “where the maximum line length is 65” and meant 65. He omitted “plus CRLF” or “including CRLF” because too many dotted i’s make the prose heavy, so you ought to be liberal in what you accept and conservative in what you generate. But when he wrote 65, he didn’t intend the readers to infer “accept lines as long as RAM will allow”.

                                  https://rant.gulbrandsen.priv.no/postel-principle is the same argument, perhaps better put.

                                  IMO this is another case of someone wise saying something wise, being misunderstood, and the misunderstanding being a great deal less wise.

                                  1. 2

                                    I can’t really understand advocating laws around protocols except for “the protocol is the law”. Maybe you had to be there at the time.

                                  2. 6

                                    As I understand it, they’re protecting one set of users from a class of attack by disabling support for some crypto methods. That seems very far from “absolutely no technical reason”.

                                    As for HTTP, if that were available, countries like Turkey would be able to block Wikipedia on a per-article basis, and/or surveil its citizens on a per-article basis. With HTTPS-only, such countries have to open/close Wikipedia in toto, and cannot surveil page-level details. Is that “no reason”?

                                    1. 1

                                      As for HTTP, if that were available, countries like Turkey would be able to block Wikipedia on a per-article basis, and/or surveil its citizens on a per-article basis. With HTTPS-only, such countries have to open/close Wikipedia in toto, and cannot surveil page-level details. Is that “no reason”?

                                      I don’t understand why people think this is an acceptable argument for blocking HTTP. It reminds me of that jealous spouse scenario where someone promises to inflict harm, either to themselves or to their partner, should the partner decide to leave the relationship. “I’ll do harm if you censor me!”

                                      So, Turkey wants to block Wikipedia on a per-article basis? That’s their decision, and they’ll go about it one way or another; I’m sure the politicians don’t particularly care about the tech involved anyway (and again, it’s trivial for any determined entity to block port 443 and run a masquerading proxy on port 80, and if this is done on all internet connections within the country, it’ll work rather flawlessly, and no one would know any better). So it’s hardly a deterrent for Turkey anyway. Why are you waging your regime-change wars on my behalf?

                                      1. 1

                                        Well, Wikipedia is a political project, in much the same way that Stack Overflow is. The people who write have opinions on whether their writings should be available to people who want to read.

                                        You may not care particularly whether all of or just some of the information on either Wikipedia or SO are available to all Turks, but the people who wrote that care more, of course. They wouldn’t spend time writing if they didn’t care, right? To these people, wanting to suppress information about the Turkish genocide of 1915 is an affront.

                                        So moving to HTTPS makes sense to them. That way, the Turkish government has to choose between

                                        • allowing Turks to read about the genocide
                                        • not allowing Turks any use of Wikipedia

                                        The Wikipedians are betting that the second option is unpopular with the Turks.

                                        It’s inconvenient for old iPad users, but if you ask the people who spend time writing, I’m sure they’ll say that being able to read about your country’s genocide at all is vastly more important than being able to read it using old iPads.

                                    2. 4

                                      I can think of several reasons:

                                      • not letting people know what you are reading
                                      • not letting people censor some articles
                                      • not letting people modify some articles (for example putting an incorrect download link for a popular software without being detected)
                                      • making a habit that everything should be HTTPS (for example, so people are not fooled by phishing sites with the padlock in the URL bar)
                                      1. 2

                                        So what’s to stop a totalitarian regime from doing the following?

                                        • Redirect all DNS queries to their own DNS servers? The root DNS servers use fixed IP addresses, so it would be easy enough to reroute those addresses to return any address they want.
                                        • Redirect all DoH to 1.1.1.1 (or other well known DoH addresses) to again, their own server? Is the CloudFlare public key installed on all browsers? How would you know you are hitting CloudFlare, and not TotallyCloudFlare served by TotallyLegitCA?
                                        • Given control over DNS, redirect users to TotallyWikipedia? Again, do you know what CA Wikipedia uses? They can then decode (doesn’t matter if it’s SSL/1.0 or TLS/1.3) the request and proxy it or send out security to question the loyalty of the citizen. Or you know, download the entirety of Wikipedia (which anyone can do), and serve up a cleaned up version to their citizens.
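The reason all three steps hinge on getting that CA trusted can be shown with a few lines of Python: a default TLS client context requires the server certificate to chain to a CA in the local trust store *and* to name the requested host, so redirected DNS alone fails the handshake unless the regime’s CA has been installed on the victim’s machine. A minimal sketch:

```python
import ssl

# A default client context, as used by well-behaved HTTPS clients.
# An impostor server behind a spoofed DNS answer cannot satisfy these
# two checks without a locally installed rogue CA.
ctx = ssl.create_default_context()
assert ctx.check_hostname is True            # certificate must match the hostname
assert ctx.verify_mode == ssl.CERT_REQUIRED  # chain must validate to a trusted CA
```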
                                        1. 1

                                          The difficulty is to set up/enroll TotallyLegitCA. How do you do that? If TotallyLegitCA is public, the transparency log will quickly reveal what they are doing. The only way to pull that off seems to be forcing people to have your CA installed, like Kazakhstan is doing.

                                          1. 2

                                            We’re talking about a totalitarian regime (or you know, your standard corporation who install their own CA in the browser).

                                      2. 3

                                        That’s actually incorrect. There are various technical reasons. But also remember that they need to operate on a vast scale as a non-profit. This is hard.

                                        Here are some technical reasons. I’m sure others will chime in as there are likely many more.

                                        • some attacks on TLSv1.0 can compromise key material which is used for the newer, secure versions of TLS
                                        • attacks only get better
                                        • removing old code reduces complexity
                                        1. 0

                                          providing a read-only version without login over HTTP shouldn’t really add any new code, except that they’d be on an HTTP/2-only webserver, if I’m not mistaken.

                                        2. 2

                                          There are arguments for an inverse-postel’s law given in https://m.youtube.com/watch?v=_mE_JmwFi1Y

                                          1. 0

                                            But I hear all the time that I must ensure my personal site uses HTTPS and that soon browsers will refuse to connect to “insecure” sites. Isn’t this a good thing Wikipedia is doing? /s

                                            Edit also see this discussion: https://lobste.rs/s/xltmol/this_page_is_designed_last#c_keojc6

                                            1. 7

                                              I have HTTPS on my completely static website mostly so that no one asks why I don’t have HTTPS, but on the other hand, the “completely static” part is only relevant as long as there are only Eves in the middle and no Mallories.

                                              If serving everything over HTTPS will make the life of ISPs injecting ads and similar entities harder, it’s a good thing, until there’s a legal rather than technical solution to that.

                                              1. 2

                                                 I actually think that HTTPS is reasonable for Wikipedia, if for nothing else than to hinder third parties from capturing your embarrassing edits to “MLP: FIM erotica” and tracing them back to you. For a static, read-only site it just adds cost and/or a potential point of failure.

                                                1. 1

                                                  For a static, read-only site it just adds cost and/or a potential point of failure.

                                                  dmbaturin just said what the value add is. HTTPS prevents third parties from modifying the content of your static site.

                                          1. 1

                                            This is a great solution if you are already running prometheus, or if you are interested in doing so. I do like the simplicity of hchk.io for cases where I don’t want to run prometheus (and related services/tooling like grafana, and push-gateway).

                                             Great idea and writeup though! Next time I have to run Prometheus at a job, I’ll definitely keep this in mind for tracking the errant cron jobs that always seem to sneak in there somewhere.

                                            1. 1

                                              As I mentioned in https://blog.bejarano.io/alertmanager-alerts-with-amazon-ses/#sup1, I do not run Grafana or any dashboarding because I consider it worthless and time-consuming to set up.

                                              Thanks for the feedback!

                                              1. 1

                                                At a small scale the expression browser is sufficient (I use it for most of my work), but once you get beyond that something like Grafana is essential.

                                            1. 4

                                              I run Prometheus at home, though I’m obviously a bit biased there.

                                              1. 3

                                                I’m not biased and I run Prometheus at home, and elsewhere. Blackbox Exporter running on RPis in various physical/network locations, with ZeroTier. Most of the Blackbox Exporter target configuration uses DNS service discovery. Alert Manager for alerting. I’ve used many different monitoring systems and recommend Prometheus+Grafana, with Netdata for some low-level monitoring.
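For anyone wanting to reproduce the Blackbox Exporter part of that setup, the scrape configuration uses the indirection pattern from the exporter’s documentation: Prometheus scrapes the exporter’s `/probe` endpoint and passes the real target as a parameter via relabeling. A sketch (the target URL and exporter address are placeholders):

```
scrape_configs:
  - job_name: blackbox_http
    metrics_path: /probe
    params:
      module: [http_2xx]         # probe module defined in blackbox.yml
    static_configs:
      - targets:
          - https://example.org  # endpoint to probe (placeholder)
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:9115  # where the Blackbox Exporter listens
```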

                                              1. 1

                                                 People are suggesting keeping your Gmail account “alive” for a while, but if that account is bound to something that you own, like your Git commits somewhere, it means that you’ll have to keep that account safe, forever.

                                                I have two questions:

                                                 • Is there a way of changing your commit history to reflect a new email address that does not belong to a centralized corporation but to you, in the form of a domain you own?
                                                 • Is it possible to use another identification mechanism, a signature that is not bound to an email address? An email address requires infrastructure to work, and could eventually belong to someone else, like the domain your email is part of
                                                1. 2

                                                  Is there a way of changing your commit history to reflect to a new email address that does not belong to a centralized corporation but to you, in the form of a domain you own.

                                                   Yes in theory; however, that changes all the hashes, so no in practice.

                                                  1. 2

                                                     In my experience, just start committing with the new address and update any mailmap and AUTHORS files. Can’t do anything about published history…
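The mailmap route keeps published hashes intact: `git log`, `git shortlog` and `git blame` consult a `.mailmap` file at the repository root when displaying authors, mapping old commit identities to a canonical one. A sketch with placeholder identities:

```
# .mailmap — format: Canonical Name <canonical@email> <email-as-committed>
Jane Doe <jane@janedoe.example> <jane.doe@gmail.com>
Jane Doe <jane@janedoe.example> <jdoe@old-employer.example>
```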

                                                    1. 1

                                                       You could use git filter-branch to rewrite the entire repository, replacing your old e-mail address with your new one, but that will change the hash of every commit, so it will be a terrible experience for anyone who has an existing clone of your repository. I think it’s not worth it.

                                                      1. 1

                                                        Is it possible to use another identification mechanism, a signature that is not bound to an email address? An email address requires infrastructure to work, and that eventually could belong to someone else, like your the domain your email is part of

                                                        In GitHub, you can choose to keep your email private and use something to the tune of username@users.noreply.github.com. See the details here

                                                      1. 14

                                                         I still can’t get over the fact that someone got the idea to refresh an HTML document tree 60 times per second, make an HTML document viewer render it over and over, and call it an “application”.

                                                        It’s so wrong on just too many levels that I don’t even know where to start, and people basically just don’t even notice that.

                                                        1. 11

                                                           But it doesn’t actually work that way? In AJAX apps, the DOM is only updated on events (e.g. click, network data received). You would have to have a timer to actually update it regularly in the background.

                                                          Probably the place it gets close to that is when they hijack the scroll event, which is horrible. But even that’s not changing the DOM if you’re not scrolling.

                                                          FWIW I agree with the premise of the OP, but I don’t think your criticism is accurate.

                                                          1. 7

                                                            It’s not the first time that someone got an idea to build GUIs by just extending existing document-rendering technology…

                                                            1. 9

                                                              DPS is a little bit different, because postscript is a programming language (specifically, a forth dialect with logo/turtle-style pen control). It’s relatively sensible to do widget-drawing with a language optimized for drawing arbitrary line graphics. A web app is more like trying to use dot macros to modify an MS word document at 30fps.

                                                              1. 9

                                                                A web app is more like trying to use dot macros to modify an MS word document at 30fps.

                                                                That reminds me, years ago my dad, who was a chemical engineer in a large company, showed me a simulation he’d gotten as an email attachment from a colleague. It had a pretty decent graphical animation entirely within an Excel spreadsheet. Part of the sheet was a “normal” spreadsheet with the actual formulas, but another part had cells resized to be small and rectangular, and their colors were changed a few times a second by macros, producing a medium-resolution raster-graphics display basically. This was apparently relatively common, because it made the graphical output self-contained within the same spreadsheet that you could mail around.

                                                            2. 7

I am not actually that offended by this idea, because most GUI applications are enhanced document viewers. But I do think that when your application needs to run at 60 fps, you should use something else.

For example: the interoperability problem has already been solved with Java, and if you really need more performance than that, you’d basically have to resort to lower-level code like C/C++.

But if “a glorified document viewer and/or editor” is all your application is, then a web application will more than suffice.

                                                              1. 3
                                                                1. 5

                                                                  Web apps are a cool hack, and I absolutely love the perverse joy one gets from making something impressive using the absolute wrong tools for the job. But, the point of a cool hack is that the idea that somebody would use it seriously or for important tasks is absurd.

                                                                2. 3

                                                                  A developer equivalent of https://xkcd.com/763/

                                                                  1. 2
                                                                  1. 1

                                                                    I’m loving the “log stdout” part, everything else can basically be ignored.

                                                                    1. 2

                                                                      That’s definitely an improvement over the syslog situation, at least for our deployments. The native Prometheus export is neat as well; saves having to build an adapter to run alongside for metrics.

                                                                      1. 2

I was partly joking. Not being able to log to stderr or stdout has caused so many problems, because it’s basically impossible to debug HAProxy without syslog being present (and HAProxy has the annoying tendency to stop logging if syslog hangs up, as happens when the network has a hiccup in an rsyslog situation).

                                                                        1. 1

                                                                          That exporter is one of the oldest: https://github.com/prometheus/haproxy_exporter

                                                                          1. 2

                                                                            Nope, this is a new, exporter-less endpoint, built into HAProxy itself: https://www.haproxy.com/blog/haproxy-exposes-a-prometheus-metrics-endpoint/

                                                                      1. 3

Speaking as a Prometheus developer, it’s very easy to run Prometheus locally, and I’ve done this in the past to debug both Prometheus (https://www.robustperception.io/optimising-go-allocations-using-pprof) and other applications. Most of the time, though, I’m debugging the sort of issue that metrics aren’t suitable for, so I’ll fall back to print-line debugging.

                                                                        1. 4

                                                                          https://www.robustperception.io/blog covers the Prometheus monitoring system, how to use it and why it is the way it is.

                                                                          1. 2

                                                                            Could one store the IP address of the initial request that causes you to generate a JWT in the token itself? Then you can validate that the current request comes from the same IP. If they’re different, then force them to log in again from their current IP.

The user would need to re-login if they turn on a VPN or change locations, but that’s a small price to pay if it reduces the possibility for certain types of attacks. I’m definitely not a security expert, but I’m working on a fairly sensitive app where a breach would be bad for users. The fact that I haven’t seen this suggested next to more complex safeguards makes me think there’s a fundamental flaw in it that I’m just not thinking of.
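A rough sketch of the idea (the helper names here are hypothetical, and a real application would use an established JWT library rather than hand-rolled signing):

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"  # hypothetical signing key

def issue_token(user: str, client_ip: str) -> str:
    # Embed the client IP seen at login time as a claim in the token.
    payload = json.dumps({"sub": user, "ip": client_ip}).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig

def validate_token(token: str, current_ip: str) -> bool:
    # Reject if the signature is bad OR the request IP differs from
    # the IP claim: the latter forces a fresh login after an IP change.
    body, sig = token.rsplit(".", 1)
    payload = base64.urlsafe_b64decode(body)
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    return json.loads(payload)["ip"] == current_ip
```

As the replies below note, the weak point is the assumption that a legitimate user keeps one IP for the lifetime of the token.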

                                                                            1. 5

IPs aren’t a great factor to base stuff like this on, although it’s a good idea.

                                                                              I think what’s better is something like token binding (https://datatracker.ietf.org/wg/tokbind/documents/) which is a way to pin a certain token to a specific TLS session. This way you have some basic guarantees. But in the real world things are sorta messy =p

                                                                              1. 2

                                                                                Most home users would have to re log in every day. Services that tie my login to an IP address piss me off so much because they are constantly logging me out.

                                                                                1. 2

                                                                                  The fact that I haven’t seen this suggested next to more complex safeguards makes me think there’s a fundamental flaw in it that I’m just not thinking of.

It’s not a safe presumption that a user’s requests will always come from the same IP, even from request to request. Their internet access could be load balanced or otherwise change due to factors like roaming.

                                                                                  1. 1

                                                                                    Yeah that is also a common technique for cookies. If the remote IP changes you can invalidate the cookie.

                                                                                  1. 4

                                                                                    Nice article.

Beware that InstrumentHandler is deprecated; the functions in https://godoc.org/github.com/prometheus/client_golang/prometheus/promhttp are the recommended replacement.

                                                                                    Splitting out latency with a success/failure label is also not recommended as a) if you have only successes or only failures, your queries break and b) users tend to create graphs of only success latency and miss all those slow failing requests. Separate success and failure metrics are better, and also easier to work with in PromQL.

                                                                                    1. 3

Thanks for the suggestions, Brian! The promhttp package contains even more nice things, like in-flight requests. Maybe we should explicitly say in the docs that InstrumentHandler is deprecated in favor of the promhttp types? I don’t mind making a PR to the docs.

                                                                                    1. -1

                                                                                      [Title] /proc/<pid>/stat is broken

                                                                                      This sounds serious! Is the content of the pseudo-file associating incorrect PIDs or parent PIDs to processes?

                                                                                      Let’s continue…

                                                                                      Documentation (as in, man proc) tells us to parse this file using the scanf family, even providing the proper escape codes - which are subtly wrong.

                                                                                      So it’s a documentation issue…

                                                                                      When including a space character in the executable name, the %s escape will not read all of the executable name, breaking all subsequent reads

                                                                                      I have literally never encountered an executable with a space in the name, although it’s perfectly legal from a file name perspective. (I’ve been a Linux user since 1998).

                                                                                      The only reasonable way to do this with the current layout of the stats file would be to read all of the file and scan it from the end […]

                                                                                      So… let’s do this instead?

The proper fix (aside from introducing the above function) however should probably be to either sanitize the executable name before exposing it to /proc/<pid>/stat […]

                                                                                      Sounds reasonable to me.

                                                                                      […], or move it to be the last parameter in the file.

                                                                                      Thus breaking all existing implementations that rely on the documentation in man proc. But I guess it can be done in some backwardly compatible way?

This problem could potentially be used to feed process-controlled data to all tools relying on reading /proc/<pid>/stat

                                                                                      I can’t really parse this. Do you mean “affect” instead of “used”?

                                                                                      In conclusion: I can’t see any evidence of the functionality of this proc pseudo-file being “broken”. You have encountered an edge case (an executable name with a whitespace character in it). You’ve even suggested a workaround (scan from the end). If you had formulated this post as “here’s a workaround for this edge case” I believe you would have made a stronger case.

                                                                                      1. 5

                                                                                        I have literally never encountered an executable with a space in the name

Well, tmux does this, for example. But my primary concern is not “has it ever happened to me” but “if it happens, what will my code do?”. As this is a silent failure (as in, the recommended method fails in a non-obvious way without indicating failure), most implementations take no action to guard against this. That, in my mind, counts as broken, and the least thing to do is to fix the documentation. Or expose single parameters in files instead of a huge conglomeration with parsing issues. Or… see above.

                                                                                        So… let’s do this instead?

                                                                                        I do, but only after I got sceptical while reading the documentation, ran some tests and had my hunch confirmed. Then I checked to see others making that very mistake.

                                                                                        Thus breaking all existing implementations that rely on the documentation in man proc. But I guess it can be done in some backwardly compatible way?

No, I don’t think so, except for introducing single-value files (and leaving /proc/<pid>/stat be as it is).

This problem could potentially be used to feed process-controlled data to all tools relying on reading /proc/<pid>/stat

                                                                                        I can’t really parse this. Do you mean “affect” instead of “used”?

Admittedly, English is not my first language; I do, however, think that sentence parses just fine. The discussed problem (which is present in several implementations based on the documentation) can potentially be used to inject data (controlled by the process, instead of the kernel) into third-party software.

                                                                                        In conclusion: I can’t see any evidence of the functionality of this proc pseudo-file being “broken”.

That depends on your view of “broken”: if erroneous documentation affecting close to all software relying on it, with a silent failure, does not sound broken to you, I guess it is not.

                                                                                        You have encountered an edge case (an executable name with a whitespace character in it).

                                                                                        I actually did not encounter it per se, I just noticed the possibility for it. But it is an undocumented edge case.

                                                                                        You’ve even suggested a workaround (scan from the end).

                                                                                        I believe that is good form.

                                                                                        If you had formulated this post as “here’s a workaround for this edge case” I believe you would have made a stronger case.

                                                                                        Maybe, but as we can see by the examples of recent vulnerabilities, you’ll need a catchy name and a logo to really get attention, so in my book I’m OK.

                                                                                        1. 1

                                                                                          Thanks for taking the time to answer the questions I have raised.

                                                                                          The discussed problem (which is present in several implementations based on the documentation), can potentially be used to inject data (controlled by the process, instead of the kernel) into third-party software.

                                                                                          Much clearer, thanks.

                                                                                          On the use of “broken”

I’m maybe extra sensitive to this, as I work in support for a commercial software application. For both legal and SLA[1] reasons, we require our customers to be precise in their communication about the issues they face.

                                                                                          [1] Service level agreement

                                                                                          1. 1

                                                                                            Followup: can you give a specific example of how tmux does this? I checked the running instances of that application on my machine and only found the single word tmux in the output of stat files of the PIDs returned by pgrep.

                                                                                            1. 2

                                                                                              On my Debian 9 machine, when starting a tmux host session, the corresponding /proc/<pid>/stat file contains:

                                                                                              2972 (tmux: client) S 2964 2972 2964 […]
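A sketch of the scan-from-the-end approach discussed in the thread, applied to that line (a minimal illustration, not the article’s actual code):

```python
def parse_stat(line: str):
    # comm (field 2) is the only field that may contain spaces or
    # parentheses, and it is wrapped in parentheses, so split on the
    # LAST ')' instead of using scanf-style "%d (%s) %c ...".
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen - 1])
    comm = line[lparen + 1:rparen]
    rest = line[rparen + 2:].split(" ")
    return pid, comm, rest

# The tmux line quoted above parses correctly despite the space:
pid, comm, rest = parse_stat("2972 (tmux: client) S 2964 2972 2964")
```

A naive whitespace split would have seen “(tmux:” as the comm field and shifted every subsequent field by one.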

                                                                                        2. 3

                                                                                          “Thus breaking all existing implementations that rely on the documentation in man proc. But I guess it can be done in some backwardly compatible way?”

                                                                                          I will never get the 100ms it took to read this sentence back….

                                                                                          1. 1

                                                                                            I dunno, maybe just duplicate the information at the end of the current format, in the author’s preferred format, and delimited by some character not otherwise part of the spec.

                                                                                            It’s not trivial, though.

                                                                                            That was my point.

                                                                                          2. 1

This was clearly overlooked when the API was designed; nobody is parsing that file from the end, and nobody is supposed to.

                                                                                            1. -1

                                                                                              What was overlooked? That executables can have whitespace in their names?

                                                                                              I can agree that this section of the manpage can be wrong (http://man7.org/linux/man-pages/man5/proc.5.html, search for stat):

                                                                                              (2) comm  %s
                                                                                                  The filename of the executable, in parentheses.
                                                                                                  This is visible whether or not the executable is
                                                                                                  swapped out.
                                                                                              

                                                                                              From the manpage of scanf:

                                                                                              s: Matches a sequence of non-white-space characters; the next
                                                                                                  pointer must be a pointer to the initial element of a
                                                                                                  character array that is long enough to hold the input sequence
                                                                                                  and the terminating null byte ('\0'), which is added
                                                                                                  automatically.  The input string stops at white space or at
                                                                                                  the maximum field width, whichever occurs first.
                                                                                              

                                                                                              So it’s clear no provision was made for executables having whitespace in them.

                                                                                              This issue can be simply avoided by not allowing whitespace in executable names, and by reporting such occurrences as a bug.

                                                                                              1. 8

                                                                                                This issue can be simply avoided by not allowing whitespace in executable names, and by reporting such occurrences as a bug

                                                                                                Ahhh, the Systemd approach to input validation!

                                                                                                Seriously, if the system allows running executables with whitespace in their names, and your program is meant to work with such a system, then it needs to work with executables with whitespace in their names.

I agree somewhat with the OP: the interface is badly thought out. But it’s a general problem: trying to pass structured data between kernel and userspace in a plain-text format is, IMO, a bad idea. (I’d rather have a binary format: the length of the string encoded in 4 bytes, then the string itself. Simple, easy to deal with, no weird corner cases.)
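A minimal sketch of such a length-prefixed encoding (the exact format here is an assumption for illustration, not an actual kernel interface):

```python
import struct

def encode_field(value: bytes) -> bytes:
    # 4-byte little-endian length prefix, then the raw bytes. There is
    # no delimiter, so spaces, parentheses, or newlines in the value
    # are harmless.
    return struct.pack("<I", len(value)) + value

def decode_field(buf: bytes, offset: int = 0):
    # Returns the decoded value and the offset of the next field.
    (length,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    return buf[start:start + length], start + length

buf = encode_field(b"evil name\nwith newline")
value, next_offset = decode_field(buf)
```

The parser never has to guess where a field ends, which is exactly the corner case that breaks the plain-text stat format.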

                                                                                                1. 1

                                                                                                  I agree it’s a bug.

                                                                                                  However, there’s a strong convention that executables do not have whitespace in them, at least in Linux/Unix.[1]

                                                                                                  If you don’t adhere to this convention, and you stumble across a consequence to this, does this mean that a format that’s been around as long as the proc system is literally broken? That’s where I reacted.

                                                                                                  As far as I know, nothing crashes when you start an executable with whitespace in it. The proc filesystem isn’t corrupted.

                                                                                                  One part of it is slightly harder to parse using C.

                                                                                                  That’s my take, I’m happy to be enlightened further.

                                                                                                  I also agree that exposing these kind of structures as plain text is arguably … optimistic, and prone to edge cases. (By the way, isn’t one of the criticisms of systemd that it has an internal binary format?).

                                                                                                  [1] note I’m just going from personal observation here, it’s possible there’s a subset of Linux applications that are perfectly fine with whitespace in the executable name.

                                                                                                  1. 3

                                                                                                    I agree with most of what you just said, but I myself didn’t take “broken” to mean anything beyond “has a problem due to lack of forethought”. Maybe I’m just getting used to people exaggerating complaints (heck I’m surely guilty of it myself from time to time).

It’s true that we basically never see executables with a space (or various other characters) in their names, but it can be pretty frustrating when tools stop working, or don’t work properly, when something slightly unusual happens. I could easily see a new-to-Linux person creating just such an executable because they “didn’t know better” and suffering as a result, because other programs on their system don’t correctly handle it. In the worst case, this sort of problem (though not necessarily this exact problem) can lead to security issues.

                                                                                                    Yes, it’s possible to correctly handle /proc/xxx/stat in the presence of executables with spaces in the name, but it’s almost certain that some programs are going to come into existence which don’t do so correctly. The format actually lends itself to this mistake - and that’s what’s “broken” about it. That’s my take, anyway.

                                                                                                    1. 2

                                                                                                      Thanks for this thoughtful response. I believe you and I are in agreement.

                                                                                                      Looking at this from a slightly more usual perspective, how does the Linux system handle executables with (non-whitespace) Unicode characters?

                                                                                                      1. 3

                                                                                                        Well, I’m no expert on unicode, but I believe for the most part Linux (the kernel) treats filenames as strings of bytes, not strings of characters. The difference is subtle - unless you happen to be writing text in a language that uses characters not found in the ASCII range. However, UTF-8 encoding will (I think) never cause any bytes in the ASCII range (0-127) to appear as part of a multi-byte encoded character, so you can’t get spurious spaces or newlines or other control characters even if you treat UTF-8 encoded text as ASCII. For that reason, it poses less of a problem for things like /proc/xxx/stat and the like.

Of course, filenames being byte sequences comes with its own set of problems, including that it’s hard to know which encoding should be used to display filenames (I believe many command-line tools use the locale’s default encoding, and that’s nearly always UTF-8 these days) and that a filename can potentially contain an invalid encoding. Then of course there’s the fact that Unicode has multiple ways of encoding the exact same text, so in theory you could get two “identical” filenames in one directory (different byte sequences, same character sequence, or at least the same visible representation). Unicode seems like a big mess to me, but I guess the problem it’s trying to solve is not an easy one.

                                                                                                        (minor edit: UTF-8 doesn’t allow 0-127 as part of a multi-byte encoded character. Of course they can appear as regular characters, equivalent to the ASCII).
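That property is easy to demonstrate (a quick check, not a proof):

```python
# Every byte of a multi-byte UTF-8 sequence has the high bit set
# (lead bytes are 0xC2-0xF4, continuation bytes 0x80-0xBF), so ASCII
# delimiters such as ' ', '(' or '\n' can never appear inside one.
for ch in "ą日本語ï":
    encoded = ch.encode("utf-8")
    assert len(encoded) > 1               # multi-byte character
    assert all(b >= 0x80 for b in encoded)  # no ASCII-range bytes
```

This is why treating UTF-8 filenames as ASCII when splitting on whitespace is safe, even when the names contain non-ASCII characters.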

                                                                                                        1. 1
                                                                                                          ~ ❯ cd .local/bin
                                                                                                          ~/.l/bin ❯ cat > ą << EOF
> #!/usr/bin/env sh
                                                                                                          > echo ą
                                                                                                          > EOF
                                                                                                          ~/.l/bin ❯ chmod +x ą 
                                                                                                          ~/.l/bin ❯ ./ą
                                                                                                          ą
                                                                                                          
                                                                                                      2. 2

                                                                                                        If you don’t adhere to this convention, and you stumble across a consequence to this, does this mean that a format that’s been around as long as the proc system is literally broken?

Yes; the proc system’s format has been broken (well, misleadingly documented) the whole time.

                                                                                                        As you note, using pure text to represent this is a problem. I don’t recommend an internal, poorly-documented binary format either: canonical S-expressions have a textual representation but can still contain binary data:

                                                                                                        (this is a canonical s-expression)
                                                                                                        (so "is this")
                                                                                                        (and so |aXMgdGhpcw==|)
                                                                                                        

                                                                                                        An example stat might be:

                                                                                                        (stat
                                                                                                          (pid 123456)
                                                                                                          (command "evil\nls")
                                                                                                          (state running)
                                                                                                          (ppid 123455)
                                                                                                          (pgrp 6)
                                                                                                          (session 1)
                                                                                                          (tty 2 3)
                                                                                                          (flags 4567)
                                                                                                          (min-fault 16)
                                                                                                          …)
                                                                                                        

                                                                                                        Or, if you really cared about concision:

                                                                                                        (123456 "evil\nls" R 123455 6 1 16361 4567 16 …)
                                                                                                        
                                                                                                    2. 3

                                                                                                      nobody is parsing that file from the end

                                                                                                      As an example, the Python Prometheus client library parses this file, and it does allow for this.
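
                                                                                                      A minimal sketch of that parse-from-the-end idea (hypothetical helper, not the client’s actual code): since the comm field is parenthesized and may itself contain spaces or ‘)’, you split on the *last* ‘)’ instead of naively on whitespace.

```python
# Sketch: parse /proc/[pid]/stat from the end. The comm field (the
# parenthesized second field) may contain spaces, ')' or newlines,
# so locate the LAST ')' and take the remaining fields from there.
# `parse_stat` is a hypothetical name, not the Prometheus client's API.

def parse_stat(data: str):
    lparen = data.index("(")            # first '(' opens comm
    rparen = data.rindex(")")           # last ')' closes comm
    pid = int(data[:lparen])            # int() tolerates the trailing space
    comm = data[lparen + 1:rparen]
    fields = data[rparen + 1:].split()  # state, ppid, pgrp, ...
    return pid, comm, fields

pid, comm, rest = parse_stat("1234 (evil )( ls) R 1 1 0")
# comm == "evil )( ls", rest[0] == "R"
```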

                                                                                                1. 3

                                                                                                  Nope. It’s 2017, it’s time to stop parsing strings with regular expressions. Use structured logging.

                                                                                                  No thanks! I’ll stick to strings.

                                                                                                  1. 2

                                                                                                    Could you please explain why? Your comment, as it is, doesn’t bring any value.

                                                                                                    1. 1

                                                                                                      Not the OP, but here’s why I don’t like structured logging:

                                                                                                      • logs will ultimately be read by humans and extra syntax gets in the way.
                                                                                                      • structured logging tends to bulk the log with too much useless information.
                                                                                                      • most of the use cases of structured logging could be better handled via instrumentation/metrics.
                                                                                                      • string-based logs can be emitted by any language without dependencies, so every system you manage can have compatible logging.

                                                                                                      Arguably a space-separated line is a fixed-schema structured log with the least extraneous syntax possible.
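
                                                                                                      For illustration (a made-up event, not from the thread), here’s the same record both ways — the space-separated line really is a fixed-schema structured log, just with the schema left implicit:

```python
# The same event as a space-separated "fixed schema" line versus a
# self-describing structured record. Field names are hypothetical.
import json

event = {"ts": "2017-05-30T12:00:00Z", "method": "GET",
         "path": "/index.html", "status": 200}

# Fixed-schema line: terse and human-skimmable, but the reader (and
# every parser) must know the field order out of band.
line = "{ts} {method} {path} {status}".format(**event)

# Structured record: more syntax, but trivially machine-parseable
# and self-describing.
record = json.dumps(event, sort_keys=True)
```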

                                                                                                      1. 6

                                                                                                        To me (in the same order):

                                                                                                        • logs are ultimately read by humans once correctly parsed/sorted, which means they should be machine-readable first, so they can easily be processed into a readable message.
                                                                                                        • Too much information is rarely a problem with logging, but not enough context often is.
                                                                                                        • Probably, but structured logging still offers some simpler ways to do this.
                                                                                                        • You just push the formatting problem from the sender (which can use a simple format) to the receiver (which has to parse different formats according to what the devs fancy)

                                                                                                        To me the best recap on why I like structured logging is: https://kartar.net/2015/12/structured-logging/

                                                                                                        1. 2

                                                                                                          most of the use cases of structured logging could be better handled via instrumentation/metrics.

                                                                                                          Speaking as a developer of Prometheus, you need both. Metrics are great for an overall view of the system and all its subsystems, but can’t tell you about every individual user request. Logs tell you about every request, but are limited in terms of understanding the broader system.

                                                                                                          I’ve written a longer article that touches on this at https://thenewstack.io/classes-container-monitoring/

                                                                                                    1. 4

                                                                                                      The public Prometheus/Grafana dashboard for the streaming: https://dashboard.congress.ccc.de/?refresh=5m&orgId=1

                                                                                                      1. 3

                                                                                                        I’d recommend reading My Philosophy on Alerting. Systems like Prometheus are designed to allow more sophisticated alerting, such as predicting when a disk will fill.
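
                                                                                                        The disk-fill case boils down to a least-squares extrapolation over recent samples — roughly what PromQL’s predict_linear() does. A hypothetical sketch of the math (not Prometheus code):

```python
# Fit a line through recent (timestamp, free_bytes) samples and
# extrapolate `horizon` seconds past the newest sample, alerting if
# the predicted free space goes negative. Names are illustrative.

def predict_free(samples, horizon):
    """samples: list of (timestamp_sec, free_bytes), oldest first."""
    n = len(samples)
    ts = [t for t, _ in samples]
    vs = [v for _, v in samples]
    t_mean = sum(ts) / n
    v_mean = sum(vs) / n
    # Ordinary least-squares slope and intercept.
    slope = (sum((t - t_mean) * (v - v_mean) for t, v in samples)
             / sum((t - t_mean) ** 2 for t in ts))
    intercept = v_mean - slope * t_mean
    return slope * (ts[-1] + horizon) + intercept

# Losing 1 GiB/hour from 10 GiB free: fire if predicted free in 8h < 0.
samples = [(i * 3600, 10 * 2**30 - i * 2**30) for i in range(4)]
will_fill = predict_free(samples, 8 * 3600) < 0
```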

                                                                                                        1. 1

                                                                                                           I had to spend time justifying the presence of that ‘0’ many times over the years as other developers questioned its purpose.

                                                                                                          I don’t get why he wouldn’t just put a code comment, or a regression test that checks memory usage. Then he doesn’t need to say anything.

                                                                                                          1. 10

                                                                                                            Per the article, he did both of those.

                                                                                                            1. 3

                                                                                                              That’s what I get for skimming and commenting. I deserve to look dumb there.

                                                                                                          1. 9

                                                                                                            tl;dr sudo parses /proc, fucks up

                                                                                                            1. 18

                                                                                                              I think that sells it a little short. There’s something to be said about a system design where parsing strings in /proc is a thing, how A leads to B leads to root, etc.

                                                                                                              1. 4

                                                                                                                It also illustrates why procfs in and of itself is bad for security.

                                                                                                                1. 4

                                                                                                          I’d take it more as: hard-to-parse formats, caused by a poor choice of how to handle field separators that appear in your data, lead to bugs. This particular parsing issue is one I’ve run into myself.
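
                                                                                                          A made-up demonstration of that separator pitfall (hypothetical values): a process name containing spaces shifts every naively-split field after it.

```python
# /proc/[pid]/stat is space-separated, but the comm field may itself
# contain spaces, so a naive split() mislabels the later fields.

stat_line = "1234 (evil 6 tty1) R 1 1 0"

naive = stat_line.split()
state_naive = naive[2]  # supposed to be the state, but here it's "6"

# Splitting after the last ')' instead recovers the real state.
state_real = stat_line[stat_line.rindex(")") + 1:].split()[0]
```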

                                                                                                                  1. 4

                                                                                                                    Really it illustrates that plain text (byte sequences) as popularized by unix is a poor interface format. Unfortunately it continues to be popular because C’s retrograde type system and poor literal support discourage those who still write C from using anything better.

                                                                                                                    1. 3

                                                                                                                      This really has nothing to do with C, and the language blaming is unwarranted. There are perfectly good (C) APIs that do not involve parsing text, and such an API could have been used here. But some people think parsing text is still the way to go.

                                                                                                                      1. 2

                                                                                                                        For most things it’s very difficult to express a good API without sum types. In C you can’t even fake them with polymorphism and the visitor trick.

                                                                                                                    2. 2

                                                                                                                      Human readable formats vs machine readable formats, really…

                                                                                                                  2. 2

                                                                                                                    What does OpenBSD do there?

                                                                                                                    1. 2

                                                                                                                      In general, sysctl.

                                                                                                                1. 5

                                                                                                   Something I’ve noticed is that docs that are reference material have a tendency over time to also expand into user guides.

                                                                                                                  I’ve never seen it work out, as the two use cases are very different.

                                                                                                                  One requires quite technical and specific information. If you mix in guides, the reader is left to carefully read the guide to see if there’s a subtlety explained therein that is relevant.

                                                                                                                  The other is more along the lines of a tutorial. Mentioning all the fine print only confuses a new user, as such details aren’t relevant to them at this stage.

                                                                                                                  1. 5

                                                                                                                    Strong agree. Long ago, I gave a talk about this: https://air.mozilla.org/rust-meetup-december-2013/

                                                                                                                    TL;DR, API docs, guides, and reference materials have three different audiences, and so need to be three different things.

                                                                                                                    1. 1

                                                                                                                      Sorry, I’m too lazy to watch the talk. I can understand the difference between API docs and guides. What makes reference materials different to API docs?

                                                                                                                  1. 2

                                                                                                                    Processing of FOSDEM videos is ongoing, about 80% are processed. 544 are currently available.