Threads for hanno

  1. 1

    This is great! Is support for PGP keys planned? Or pasting multiple keys?

    1. 3

      I originally had PGP support planned, but I’m not sure how useful that would actually be, as none of these vulns directly affect PGP (with a few exceptions, but they’re practically irrelevant).

      If you plan to check a larger number of keys I recommend using the tool instead. It’s a simple command-line tool; it shouldn’t be difficult to use.

      1. 1

        Thanks!

    1. 10

      I’ve (mercifully) never needed to deal with CVEs, but my understanding is that maintainers often dislike them because the process isn’t run by vendors/developers/maintainers but by “anyone who plugs details in the MITRE form”. After looking at the process a bit, it looks like it would be easy for me to submit a CVE for any product I wanted, give a link to a self-referential security page (“Foo has security issue bar, see CVE-XXX”), and have the same thing happen.

      Strange system.

      1. 8

        Part of the problem here is that the idea of CVEs is not aligned with what many people think CVEs are.

        Ultimately the core idea of the CVE system is just to have a common identifier when multiple people talk about the same vulnerability. In that sense having bogus CVE entries doesn’t do much harm, as it’s “just another number” and we won’t run out of numbers.

        But then some security people started treating CVEs as an achievement, aka “I did this research and I found 5 CVEs!” etc. - and suddenly you have the expectation of the CVE system being a gatekeeper of what vulnerability is “real” and what not. (And having seen a lot of these discussions, they are imho a waste of energy: you’ll always have corner cases of the form “might be a vuln, might help if you can chain it with another vuln”, and people will never agree on whether to call those vulnerabilities.)

        1. 2

          But then some security people started treating CVEs as an achievement, aka “I did this research and I found 5 CVEs!” etc. - and suddenly you have the expectation of the CVE system being a gatekeeper of what vulnerability is “real” and what not.

          That is a maddening behavior that I’ve also observed. It’s also a hard thing to fix, considering that the CVE database was conceived in the face of vendors who refused to admit that they shipped security issues. We needed a common way to reference them even if the vendor disagreed.

          I’m not sure how I think we should fix it, yet.

        2. 3

          The fact that you can is a strong check and balance: it makes sure that a maintainer cannot stonewall an actual vulnerability and pretend it doesn’t exist. The process is not without flaws, but it’s the devil we know, and for an honor system it works surprisingly well. Speaking as a member of a security team in a well known project, the CVE part of the security process is among the least complicated aspects.

          1. 2

            It’s even weirder. I needed a CVE once for a library I maintain, and I couldn’t get one! Apparently ranges of numbers are allocated to certain organizations (big corporations and Linux distros), and you’re supposed to ask “your” organization to give you a number. I was independent, and nobody wanted to talk to me.

          1. 8

            There’s an unintended security side effect of these autosave features that one should be aware of:

            In web environments it’s common that files with certain extensions get executed by an interpreter (e.g. index.php), while files with “unknown” extensions just get shipped to the user. There might be secrets (think wp-config.php) in those files that then become accessible to an attacker.

            kate likely isn’t widely used on web servers, but a) files may still be edited on a desktop and uploaded later and b) vim has very similar functionality. Particularly with vim it’s a super-common vulnerability to have https://[host]/.wp-config.php.swp downloadable.
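
            A quick way to check your own sites for this kind of leftover file is just to request a few well-known names and see whether the server returns anything (a rough sketch in Python; the candidate paths are illustrative, extend the list for your own setup):

                import urllib.request

                # Hypothetical candidate names for leftover editor files next to wp-config.php.
                CANDIDATES = [".wp-config.php.swp", "wp-config.php~", "wp-config.php.bak"]

                def check(base_url):
                    for name in CANDIDATES:
                        url = base_url.rstrip("/") + "/" + name
                        try:
                            with urllib.request.urlopen(url, timeout=10) as resp:
                                if resp.status == 200:
                                    print("possibly exposed:", url)
                        except Exception:
                            pass  # 404/403/connection errors mean nothing was served

                check("https://example.com")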

            1. 4

              True, but it’s easily fixed:

                  " keep swap, backup and undo files out of the working directory
                  " (note: vim does not create these directories for you, they have to exist)
                  set directory=~/.vim/tmp/
                  set backupdir=~/.vim/tmp/
                  set undodir=~/.vim/tmp/
              
              1. 2

                This is more or less what I do (though I put mine inside ~/.cache). I mostly did this because I got tired of adding the .swp files to various ignore lists for syncing / backup tools, but it also means that they don’t get copied to random locations, and if I rsync a directory while a file is open, vim doesn’t permanently think it’s open on the remote machine.

            1. 2

              Surely more than just the Fujifilm printers use the Basic Crypto Module of the Safezone library by Rambus. Can we expect more CVEs to come out against software that uses this library?

              1. 4

                Well, I haven’t seen these keys, and I have checked a lot of keys. So I can safely say that if there are other products using that library then either they’re not using it for key generation or they’re not very widely used or they’re not doing TLS or SSH.

                (And for the CVEs: No, I discussed that with Mitre and their current policy is that when multiple products use the same library that’s one CVE for all of them.)

                1. 2

                  I recall a study a “few” years ago (with my current understanding of the passage of time it could easily be a decade :D)

                  They basically got a large DB of public keys and just computed GCD of all of them, IIRC they found a few pairs. Could be worth someone doing again now (especially with the public CT logs providing a record of large numbers of RSA keys)
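
                  The core idea fits in a few lines of Python (toy primes for illustration; the actual studies used a batched GCD computation rather than comparing all pairs one by one): if two RSA moduli accidentally share a prime factor, a single gcd reveals it and both keys are broken.

                      from math import gcd

                      # Toy moduli: n1 and n2 accidentally share the prime p, n3 does not.
                      p, q1, q2, r1, r2 = 10007, 10009, 10037, 10039, 10061
                      n1, n2, n3 = p * q1, p * q2, r1 * r2

                      for a, b in [(n1, n2), (n1, n3), (n2, n3)]:
                          g = gcd(a, b)
                          if g > 1:
                              print(f"shared factor {g}: {a} = {g} * {a // g}, {b} = {g} * {b // g}")
                          else:
                              print("no shared factor, the gcd reveals nothing")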

                  1. 6

                    You mean https://factorable.net/

                    I have actually been trying to run this against CT log data, the problem is CT is really big and it goes beyond what this algorithm can practically handle.

                    1. 1

                      How big is really big?

                      1. 3

                        More than 6 billion certs. Just look up a recent cert on crt.sh and check the id.

                        (Though this includes certs with duplicate keys - every cert these days has a precert - and non-RSA keys, but still - it’s a lot of keys.)

                    2. 2

                      Tangentially, one thing that does happen is VERY easily factorised public keys being uploaded to key servers as a result of bad RAM. If a public key sitting in RAM picks up a bit flip due to a fault, it’s overwhelmingly likely that the flipped value has small prime factors and is far easier to factor. Most very large numbers have small prime factors, so a random bit flip almost always turns a good key (the product of two large primes) into a number that is no longer hard to break.
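
                      A rough way to see this (toy sizes, with two Mersenne primes standing in for RSA primes; finding a small factor is not a full factorization, but it shows the flipped value is no longer a product of two large primes):

                          # 2**61 - 1 and 2**89 - 1 are (Mersenne) primes.
                          p, q = 2**61 - 1, 2**89 - 1
                          n = p * q

                          def has_small_factor(x, limit=10_000):
                              # trial division by 2 and small odd numbers
                              if x % 2 == 0:
                                  return True
                              f = 3
                              while f < limit:
                                  if x % f == 0:
                                      return True
                                  f += 2
                              return False

                          flips = [n ^ (1 << i) for i in range(n.bit_length())]
                          weak = sum(1 for x in flips if has_small_factor(x))
                          print("original modulus has a prime factor below 10000:", has_small_factor(n))
                          print(f"{weak} of {len(flips)} single-bit flips have a prime factor below 10000")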

                      1. 1

                        They basically got a large DB of public keys and just computed GCD of all of them, IIRC they found a few pairs. Could be worth someone doing again now (especially with the public CT logs providing a record of large numbers of RSA keys)

                        This concept is absolutely fascinating to me. I never thought of this being used/abused in this way!

                        1. 1

                          I remember hearing about this study as a grad student, so it has indeed been at least a decade :P

                          IIRC the affected devices were mostly embedded devices, e.g. home routers and such, and it was suspected that there was an underlying problem with the randomness on those machines.

                          1. 4

                            Whenever something like this comes up it is always some network attached thing that sold in huge quantities (with reseller branding), no mechanism to disable the bad logic, and no ability to update the software even if the owners were aware of the problem.

                            Womp womp :)

                      2. 2

                        They also need someone to publish the keys somewhere, which is presumably the problem

                      1. 1

                        I literally don’t see a problem with this proposal, the certificates they’re talking about seem to be at least as secure as the certificates we use now. And I would love if browsers could improve security somewhat by warning me if I for example enter a credit card number on a site without such a QWAC certificate, same way as http:// sites are marked when you try to enter a password.

                          The only criticism I have is the name. QWAC sounds like quack, as in charlatan, expect puns from the security community in 3.. 2.. 1..

                        1. 16

                            Not really; in the past, some of the CAs failed to pass Mozilla’s basic sanity checks for inclusion.

                          There should have been a discussion with the browser vendors and they should have been much more involved.

                            Frankly, the EU should just fund Firefox development directly for geopolitical reasons, and in turn Mozilla should get involved in EU security.

                          1. 9

                              Strong agree that the EU should fund Firefox development. I’d take it one step further and say that Mozilla should be able to function on EU money alone; their stated goals seem very compatible, it would allow Mozilla to get out from under Google’s grasp, and it would make the EU less dependent on American tech.

                              But I think this is also why we should be supportive of an initiative like this; instead of saying “EU is coming with stupid plans to make our computers insecure”, we should say “Fantastic you want to look into this, did you know there’s an American nonprofit that already does this, they have a lot of experience with it already, and they could really use some funding, you should talk!”

                          2. 16

                            I literally don’t see a problem with this proposal, the certificates they’re talking about seem to be at least as secure as the certificates we use now.

                            Erh… no. That’s literally half of the article.

                            Browser vendors have minimum security requirements for CAs. Issuers for QWACs would not have to follow those, but much weaker requirements. There’s even a specific example where exactly that happened.

                            1. 4

                              QWAC is the European equivalent of EV certs and has all of the same problems. EV demonstrated very clearly that such notices do not work, and actively undermine security.

                              They do however make certificates more expensive (no chance you’re getting a free one), and break automation (can’t issue without real authentication of identity), both running counter to our established understanding of how to ensure a secure PKI.

                                You’re also assuming the CAs don’t mis-issue certificates, but browsers would be required to accept certificates from CAs that fail even the most basic sanity checks, let alone the full security and correctness requirements of the existing root stores.

                              1. 4

                                I think the idea behind QWAC, that every cert is tied to a legal entity is not necessarily a terrible one. The implementation however, well obviously they screwed that up, by allowing companies like Camerfirma who can’t figure out basic TLS certs. They are literally solving the symptom and not the actual problem.

                                  Perhaps the current govt business licensing process could be amended to also provide a way for current CAs to link a business license to a cert. Obviously that’s a big complicated mess, since business licenses are distributed across cities and counties in the USA at least, though I imagine the EU and others operate similarly.

                                  But that is the only way to do this properly: if we really want a strong link between entities and TLS certs, the existing business licensing process needs to be amended to include domains and TLS certs. Perhaps the laziest implementation would just be amending the business license application form(s) to include domain names along with phone #’s and addresses.

                                1. 7

                                  It is a terrible idea. It has literally already existed. It was called Extended Validation.

                                    Things it did: it made certificates much more expensive (>$100, initially >$1000), making people want longer-lived certs, and the identity authentication breaks automation, making it harder to automate rolling and further complicating renewal. So EV certs cause problems for actually maintaining a site’s PKI.

                                  For the benefits: none. I mean that seriously, browsers did multiple studies and just like the old padlock they found that a positive indicator carries negligible information as it is generally ignored. It gets even worse though: a company’s public name does not necessarily match their legal name, which confuses users so they don’t trust legitimate sites. Then for a double whammy: legal names are not unique, so you can make a fraudulent site with the EV (or QWAC this time round) have a “legitimate” name on any url you like.

                                    That’s why browsers dropped EV certs: they make it harder to run secure sites well, and at best they don’t do anything to help make users safer, and at worst they confuse and mislead them.

                                  1. 1

                                    QWAC and EV are both stupid, I’m 100% with you here. Nobody thought about the problem very hard coming up with these solutions, as they both suck.

                                    The idea of having some way to know that example.com is tied to business entity Example, Inc in this jurisdiction is not a bad idea. Of course there can be 500 Example Inc’s, but there can only be 1 Example inc in Olliejville, NC USA. i.e. local jurisdictions already know how to distinguish different businesses within their control.

                                    Most jurisdictions require a license to do business in Olliejville. If every city, county, state, etc just added a domains field to their applications and forms, we could then create larger indexes easily enough and have this information. That’s enough. Governments already know how to handle business licenses and having them add 1 more field is not a big deal. Of course it’s not perfect and it would take considerably longer, but it’s arguably a way better solution than QWAC or EV. Centralizing this is mostly idiotic. Once this is deployed to some reasonable amount of areas and indexes get made, perhaps it makes sense for CA’s to create TLS certs with this information, basically filling out the city, town and name fields for the domain in question for you. No verification needed, they just need to trust local govt X and Y to get the information correct.

                                    Note, this “solution” I just created off the top of my head, I’m sure better ones exist if someone thought about it harder than I did.

                                      The idea of having a legal entity <-> domain mapping is not a bad one. Obviously our past and proposed implementations, where some crappy company is supposed to verify all this, are idiotic; they have ZERO incentive to get any of this correct and will just do the bare minimum until it’s useless information, just like EV certs were. I agree with you there. Local govts are not in the same boat, they have incentives to get the data right.

                                    1. 3

                                        Your legal entity <-> domain mapping cannot be done in a way that helps users.

                                      The researcher who got an EV cert for their local stripe, inc could have turned around and got strípe.com, and a local government’s records will be able to say stripe, inc <-> strípe.com. A user will see strípe.com and the big “you can trust this” UI the browser is forced to show will even say “the government agrees, this site definitely belongs to Stripe, inc”.

                                        Similarly, if a user is shown a company name “Foo” on bar.com, what are they meant to think? It is exceedingly common for the legal name of a company to be completely different from the brand name. So users have to decide which one to trust.

                                        The only field that actually demonstrates the identity of a website is the URL. Anything else is irrelevant.

                                      There is nothing you can add to a certificate, and no field you can require the UI to display, that is more trustworthy than the domain.

                                        It isn’t even a matter of local governments having an interest in keeping those records accurate. A local shop selling painting supplies called Stripe can register their domain as being str1pe.com, and should be able to get the magic certificate flag. I’m going to go out on a limb and say that their security doesn’t match the security of the American Stripe, Inc.

                                      You’re also making an assumption about what the “best interests” of such an organization is: Plenty of counties, states, or even countries, make significant income from business registrations, for companies that do not in any meaningful sense exist in those locations. Their interest is in having companies be registered, and making that as painfree for the companies as possible, that means accepting the urls they use.

                                        You’re assuming that these CAs really do any serious validation, which even when EV was restricted to the high-end CAs they did not: they were happily issuing EV certs with obviously incorrect information (https://scotthelme.co.uk/extended-validation-not-so-extended/).

                                        The final question is what happens when a CA encounters an EV cert for a company whose name matches another, more “famous”/important company. Because past research says they’ll blindly revoke the 100% compliant, correct, and accurate certificate and keep the money.

                                      There is no certificate identity <-> domain mapping that adds security, as the only thing that matters is the url.

                                      1. 1

                                        In every single comment I have said EV is bad, yet you haven’t seemed to grasp that I said that. let’s try again: EV CERTS ARE STUPID. Can we move on from all of that now? You seem to have not listened at all to what I said. Stop thinking about browser UI or web security, that’s not remotely my point.

                                          It should be relatively easy to track down, out in the physical real world, the person or persons responsible for example.com. This really only matters when an entity is doing business via example.com, so it really only needs to apply to business entities. Hence adding a domain field to existing business licenses solves the problem. It’s the same as a telephone # or an address.

                                        1. 1

                                            Ok, I’ve re-read, so I want to clarify. Are you saying it is reasonable for a local government’s company listings to include a url? E.g. identity->url (I honestly assumed most would have that now in contact info sections, but governments are slow). In that case I agree it seems useful.

                                          You use <-> which I assumed meant the cert would also have a legal name style entry that was somehow “special” and get UI treatment that would make it seem trustworthy (vs the already present subject organization name, which is intentionally not distinguished from any other field in most cert viewers)

                                          1. 1

                                            Are you saying it is reasonable for a local government’s company listings to include a url?

                                            Yes. I can’t say with any certainty about most business license forms, but all the ones I’ve ever filled out have never asked for this information. I’ve occasionally seen an email address field though. :)

                                            You use <-> which I assumed meant the cert would also have a legal name style entry that was somehow “special” and get UI treatment that would make it seem trustworthy (vs the already present subject organization name, which is intentionally not distinguished from any other field in most cert viewers)

                                            No, we already know this is stupid and would be a terrible way to do it.

                                              If one wanted to do something like this, arguably a better way would be to have the local city/etc. govt cross-sign an existing TLS cert (say from Let’s Encrypt) with their own, saying: we attest (sign) that this cert belongs to this company. This can all be done with ACME (I’m pretty sure, it’s been a while since I’ve read the spec, but I think I remember it’s fine) in an automated fashion, so it’s not a big deal to add to the existing workflow. This doesn’t change the security at all, and doesn’t require any UI changes.

                                            1. 1

                                              Ok, so we do agree - I just misinterpreted <-> as meaning you wanted a bidirectional relationship :)

                                              1. 1

                                                  Well I do, but it can require some work; again the point isn’t that it be all up in your face, the point is that the mapping exists, so if one needs the mapping for some purpose (law enforcement, or research, or whatever), it can be done reasonably. If for some reason there exists a valid use-case to make it easy and all up in your face, like a web UI change, it could be added eventually, but that use-case is far from certain or clear at this point in time. We know from the EV debacle that it’s probably a disaster to just assume it’s useful from day one. I for one am not advocating for any UI changes.

                              1. 2

                                Took me a few days to find the time to read the post.

                                I feel like I’d remove quite a bit more flexibility. E.g. all the discussions about e - I simply think this shouldn’t be part of the key, but fixed. Almost every standard in the past year settled on e=65537, so I think if you reimplement anything in RSA, your code should be “if (e!=65537) fail;”.

                                Further you’re kinda doing that, but I’d emphasize that one shouldn’t allow arbitrary keysizes and it’s adding a whole lot of complexity to RSA that this was ever allowed. Let’s Encrypt started disallowing anything !={2048,3072,4096} for interesting reasons (makes it easier to block Debian weak keys).
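
                                A minimal sketch of that kind of strictness when loading a key, here using the pyca/cryptography package (an assumption; use whatever library you actually have, the point is just to fail loudly on anything unusual):

                                    from cryptography.hazmat.primitives import serialization
                                    from cryptography.hazmat.primitives.asymmetric import rsa

                                    ALLOWED_KEY_SIZES = {2048, 3072, 4096}

                                    def load_strict_rsa_public_key(pem_bytes):
                                        key = serialization.load_pem_public_key(pem_bytes)
                                        if not isinstance(key, rsa.RSAPublicKey):
                                            raise ValueError("not an RSA key")
                                        if key.public_numbers().e != 65537:
                                            raise ValueError("unexpected RSA public exponent, refusing key")
                                        if key.key_size not in ALLOWED_KEY_SIZES:
                                            raise ValueError(f"unsupported RSA key size {key.key_size}")
                                        return key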

                                1. 1

                                  Good points all around :)

                                1. 1

                                  It should be noted that this is an old blogpost and this no longer works with latest glibc. Still interesting though.

                                  1. 1

                                    If I’m reading this right, the oft-repeated nightmare that Quantum computers would render crypto ineffective was, possibly a little overstated?

                                    1. 6

                                      I’m not sure this was ever really a question.

                                      It was always clear that Shor’s algorithm only applied to very specific problems. Unfortunately it turned out these were the exact problems that were used in pretty much all mainstream public key cryptography. But there always were alternatives.

                                      One likely quantum safe cryptosystem is McEliece, which was developed in the 70s. It is not very practical due to very large keys, so it’s likely not gonna be the one that your future browser will use.

                                      1. 2

                                          Wikipedia has a good summary. As someone with very, very limited understanding of the mathematics of cryptography, I take the tl;dr to be: current symmetric encryption and hash algorithms are probably fine, but will have to double their key size; current public-key algorithms are broken, but there are replacements waiting in the wings.

                                      1. 5

                                        Reading this so many years later it’s kinda remarkable, particularly as this was six years before the Snowden leaks.

                                          The Dual EC scandal only really got larger attention after Snowden, when many more details around deployment became known (particularly the RSA Inc and Juniper stories). But as this post shows, the key facts around the broken/backdoored cryptography were known and publicly discussed long before that.

                                        1. 3

                                            What I find interesting is that a lot of the criticism seems to be about underspecification and variations of the format.

                                            It’s obvious that this is a problem, but is it an unfixable one? It seems there is an older RFC, but that’s still underspecified. It also seems to me that a way forward to improve things would be:

                                            a) Write up a new RFC that resolves all the inconsistencies to something sane, ideally something that a large number of applications already do, and try to get as many of them on board promising to support it.

                                          b) Name that something easy to remember, like “CSV, 2021 spec” or something.

                                          c) All applications provide at least an option to use “CSV, 2021 spec” and ideally move to that being the default.

                                            Please note that this wouldn’t mean “CSV, 2021 spec” can only be used to exchange data with other applications supporting it. Given that we try to specify what most applications already do, unless you have weird edge cases it probably already works in most cases with existing CSV-supporting applications.
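
                                            To make that concrete, such a pinned-down profile would mostly amount to fixing the choices RFC 4180 leaves open; in Python terms it could look roughly like this (the values are my guess at what “what most applications already do” would translate to, not anything agreed upon):

                                                import csv, io

                                                class Csv2021(csv.Dialect):
                                                    # one plausible pin-down: comma, double-quote quoting, CRLF line endings
                                                    delimiter = ","
                                                    quotechar = '"'
                                                    doublequote = True
                                                    escapechar = None
                                                    skipinitialspace = False
                                                    lineterminator = "\r\n"
                                                    quoting = csv.QUOTE_MINIMAL

                                                csv.register_dialect("csv2021", Csv2021)

                                                buf = io.StringIO()
                                                csv.writer(buf, dialect="csv2021").writerow(["a", 'say "hi"', "1,5"])
                                                print(buf.getvalue())  # a,"say ""hi""","1,5"
                                                print(next(csv.reader(io.StringIO(buf.getvalue()), dialect="csv2021")))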

                                            FWIW I think there are similar inconsistencies in JSON parsers, and probably pretty much the same should be done there.

                                          1. 4

                                            This reminds me of https://xkcd.com/927/ :)

                                            I think you underestimate how many non-technical people produce datasets. None of these people will ever have heard of RFCs or whatever else TLA you may say to them.

                                            1. 1

                                              I know that comic, but I think I made clear that I absolutely did not want to do this.

                                              My proposal would be to spec what is as close to existing solutions as possible and would likely work in most situations right from the start.

                                              1. 1

                                                I understand, but I would not even know where such a spec would start and where it would end. Would it include date-times? How about timezone offsets? What about number formatting? Character encodings? This all gets very complicated very quickly.

                                            2. 2

                                                The primary reason why CSV is such a ubiquitous format is that anyone can understand it quickly, and therefore nobody is coding against any spec. The RFC that exists was merely retrospective.

                                              The very thing that makes CSV so common is also the reason why drafting a new spec will be unlikely to gain traction.

                                              1. 1

                                                Nobody has stepped up to the plate so far. Feel free to start! Note, that I’m not volunteering, but I encourage you to!

                                              1. 3

                                                  Repology is a way to check which version of glibc a bunch of Linux distributions include in their respective repositories: https://repology.org/project/glibc/versions

                                                There doesn’t seem to be a single major distro that’s upgraded to 2.34 yet in a stable release. It’s hard to rapidly release such an integral library, so we might be waiting a while before the rebuilds are finished everywhere.

                                                1. 4

                                                  This is not how distros work, at least most of them.

                                                  They usually ship the version of a library that was stable when they made their last stable release and then backport important fixes.

                                                1. 0

                                                  Yes STARTTLS can be downgraded, but how do most clients react when you simply block ports 465, 993 and 995? The client will often try to connect on port 25, 110 or 143 instead.

                                                    Those might be closed server-side since you don’t want to allow clients to connect insecurely, but a hacker that was determined enough to inject data in a STARTTLS session can just as easily set up an stunnel MITM on the insecure ports.

                                                  1. 1

                                                    I haven’t seen such behavior during our tests and I would definitely consider it a security vulnerability.

                                                    Can you name a specific client that will connect through plaintext ports if TLS ports are blocked?

                                                  1. 11

                                                    STARTTLS always struck me as a terrible idea. TLS everywhere should be the goal. Great work.

                                                    1. 6

                                                      Perhaps this is partially the result of a new generation of security researchers gaining prominence, but progressive insight from the infosec industry has produced a lot of U-turns. STARTTLS was obviously the way forward, until it wasn’t, and now it’s always been a stupid idea. Never roll your own crypto, use reliable implementations like OpenSSL! Oh wait, it turns out OpenSSL is a train wreck, ha ha why did people ever use this crap?

                                                      As someone who is not in the infosec community but needs to take their advice seriously, it makes me a bit more wary about these kinds of edicts.

                                                      Getting rid of STARTTLS will be a multi-year project for some ISPs, first fixing all clients until they push implicit TLS (and handle the case when a server doesn’t offer implicit TLS yet), then moving all the email servers forward.

                                                      Introducing STARTTLS had no big up-front costs …

                                                      1. 9

                                                        Regarding OpenSSL I think you got some bad messaging. The message is not “don’t use OpenSSL”. The real message was “all crypto libraries are train wrecks and need more funding and security auditing”. But luckily OpenSSL has improved a lot, and you should still use a well-tested implementation and not roll your own crypto and OpenSSL is not the worst choice.

                                                        Regarding STARTTLS I think what we’re seeing here is that there was a time when crypto standards valued flexibility over everything else. We also see this in TLS itself where TLS 1.2 was like “we offer the insecure option and the secure option, you choose”, while TLS 1.3 was all about “we’re gonna remove the insecure options”. The idea that has gained a lot of traction is that complexity breeds insecurity and should be avoided, but that wasn’t a popular idea 20-30 years ago when many of these standards were written.

                                                        1. 2

                                                          The message is not “don’t use OpenSSL”. The real message was “all crypto libraries are train wrecks and need more funding and security auditing”. But luckily OpenSSL has improved a lot, and you should still use a well-tested implementation and not roll your own crypto and OpenSSL is not the worst choice.

                                                          100%

                                                          I prefer libsodium over OpenSSL where possible, but some organizations can only use NIST-approved algos.

                                                      2. 3

                                                        Agreed. It always felt like a band-aid as opposed to a well thought out option. Good stuff @hanno.

                                                        1. 3

                                                          Your hindsight may be 20/20 but STARTTLS was born in an era where almost nothing on the Internet was encrypted. At that time, 99% of websites only used HTTPS on pages that accepted credit card numbers. (It was considered not worth the administrative and computing burden to encrypt a whole site that was open to the public to view anyway.)

                                                          STARTTLS was a clever hack to allow opportunistic encryption of mail over the wire. When it was introduced, getting the various implementations and deployments of SMTP servers (either open source or commercial) even to work together in an RFC-compliant manner was an uphill battle on its own. STARTTLS allowed mail administrators to encrypt the SMTP exchange where they could while (mostly) not breaking existing clients and servers, nor requiring the coordination of large ISPs and universities around the world to upgrade their systems and open new firewall ports.

                                                          Some encryption was better than no encryption, and that’s still true today.

                                                            That being said, I run my own mail server and I only allow users to send outgoing mail on port 465 (TLS). But for mail coming in from the Internet, I still have to allow plaintext SMTP (and hence STARTTLS support) on port 25 or my users and I would miss a lot of messages. I look forward to the day that I can shut off port 25 altogether, if it ever happens.

                                                          1. 2

                                                            Your hindsight may be 20/20 but STARTTLS was born in an era where almost nothing on the Internet was encrypted.

                                                            I largely got involved with computer security/cryptography in the late 2000’s, when we suspected a lot of the things Snowden revealed to be true, so “encrypt every packet securely” was my guiding principle. I recognize that wasn’t always a goal for the early Internet, but I was too young to be heavily involved then.

                                                            Some encryption was better than no encryption, and that’s still true today.

                                                              Defense against passive attackers has value, but in the face of active attackers, opportunistic encryption is merely security theater.

                                                            I look forward to the day that I can shut off port 25 altogether, if it ever happens.

                                                            Hear hear!

                                                            1. 2

                                                                Defense against passive attackers has value, but in the face of active attackers, opportunistic encryption is merely security theater.

                                                              That’s not quite true, it still provides an audit trail. The goal of STARTTLS, as I understand it, is to avoid trying to connect to a TLS port, potentially having to wait for some arbitrary timeout if a firewall somewhere is set to drop packets rather than reject connections, and then retry on the unencrypted path. Instead, you connect to the port that you know will be there and then try to do the encryption. At this point, a passive attacker can’t do anything, an active attacker can strip out the server’s notification that STARTTLS is available and leave the connection in plaintext mode. This kind of injection is tamper-evident. The sender (at least for mail servers doing relaying) will typically log whether a particular message was sent with or without STARTTLS. This logging lets you detect which messages were potentially leaked / tampered with at a later date. You can also often enforce policies that say things like ‘if STARTTLS has ever been supported by this server, refuse if it isn’t this time’.

                                                                Now that TLS support is pretty much table stakes, it is probably worth revisiting this and defaulting to connecting on the TLS port. This is especially true now that most mail servers use some kind of asynchronous programming model, so trying to connect on port 465 and waiting for a timeout doesn’t tie up too many resources. It’s not clear what the failure mode should be though. If an attacker can tamper with port 25 traffic, they can also trivially drop everything destined for port 465, so trying 465 and retrying on 25 if that fails is no better than STARTTLS (actually worse: rewriting packets is harder than dropping packets; one can be done by inspecting the header, the other requires deep-packet inspection). Is there a DNS record that can tell connecting mail servers to not try port 25? Just turning off port 25 doesn’t help because an attacker doing DPI can intercept packets for port 25 and forward them over a TLS connection that it establishes to 465.
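
                                                                On the sending side, the simplest version of the “require STARTTLS, never silently fall back” policy mentioned above is easy to sketch (Python’s smtplib with a hypothetical host; real MTAs would implement this in their delivery logic and per-destination policy tables):

                                                                    import smtplib, ssl

                                                                    def send_requiring_starttls(host, sender, rcpt, msg):
                                                                        with smtplib.SMTP(host, 25, timeout=30) as s:
                                                                            s.ehlo()
                                                                            if not s.has_extn("starttls"):
                                                                                # tamper-evident failure: log/refuse instead of silently sending in plaintext
                                                                                raise RuntimeError(f"{host} did not offer STARTTLS, refusing to send")
                                                                            s.starttls(context=ssl.create_default_context())  # also verifies the certificate
                                                                            s.ehlo()
                                                                            s.sendmail(sender, rcpt, msg)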

                                                        1. 22

                                                          Does anyone, anywhere ever get taught how to design a file format? It seems a giant blind spot that people seldom talk about, unless like this person they end up needing to parse or emit a particularly hairy one.

                                                          A while ago I discovered RIFF and was just like “why are we not using this everywhere?”

                                                          1. 19

                                                              In university it came as a natural side effect of OS design (file systems and IPC) and network communication (device-independent exchange). It’s enough for a start; then go by cautionary tales and try to figure out which ones apply in your particular context. You’ll be hard pressed to find universal truths here to be ‘taught’. Overgeneralise and you create a poor file system (zip), overspecialise and it’s not a format, it’s code.

                                                              The latter might be a bit surprising, but take ZSTD in dictionary mode. You can noticeably increase information density by training it on case-specific data. The resulting dictionary needs to go somewhere, as it is not necessarily a part of the bitstream, and the decoding stage needs to know about it. Do you preshare it and embed it in the application, or put it as part of the bitstream? Both can be argued; both have far-reaching consequences.

                                                              The master level for file formats, if you need something to study, I’d say is media container formats, e.g. MKV. You have multiple data streams of different sizes, some are relevant and some are to be skipped, and it is the consumer that decides. Seeking is often important and the reference frames may be at highly variable offsets. There are streaming / timing components, as your spinning-disk media with a 30 GB file has considerable seek times and rarely enough bandwidth and caches. They are shared in contexts that easily introduce partial corruption that accumulates over time; a bit flip in a subtitle stream shouldn’t make the entire file unplayable, and so on.

                                                              RIFF as an example is a TLV (tag-length-value). These are fraught with dangers. It is also the one that everyone comes up with and it has many many names. I won’t spoil or go into it all here, part of the value is the journey. Follow RIFF to EXIF to XMP and see how the rationale expands to “make sense” when you suddenly have a Jpeg image with a Base64 encoded XML indexed Jpeg image inside of it as part of a metadata block. Look at presentations by Ange Albertini ( e.g. Funky File Formats: https://www.youtube.com/watch?v=hdCs6bPM4is ), Meredith Patterson ( Science of Insecurity: https://www.youtube.com/watch?v=3kEfedtQVOY ) and Travis Goodspeed ( Packets in Packets: https://www.youtube.com/watch?v=euMHlV6MNqs).
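
                                                              As a taste of the classic TLV pitfalls, here is a toy RIFF-style chunk reader (a sketch, not a real RIFF parser): the whole game is not trusting the length field, otherwise a chunk that claims more bytes than remain, or one that doesn’t advance the offset, turns into out-of-bounds reads or an infinite loop.

                                                                  import struct

                                                                  def read_chunks(data: bytes):
                                                                      """Parse a toy stream of (4-byte tag, u32 little-endian length, payload) chunks."""
                                                                      off = 0
                                                                      while off < len(data):
                                                                          if len(data) - off < 8:
                                                                              raise ValueError("truncated chunk header")
                                                                          tag = data[off:off + 4]
                                                                          (length,) = struct.unpack_from("<I", data, off + 4)
                                                                          start = off + 8
                                                                          if length > len(data) - start:
                                                                              # the classic TLV bug is skipping this check and trusting the length blindly
                                                                              raise ValueError(f"chunk {tag!r} claims {length} bytes, only {len(data) - start} remain")
                                                                          yield tag, data[start:start + length]
                                                                          off = start + length + (length & 1)  # RIFF-style padding of odd-sized chunks

                                                                  good = b"FMT " + struct.pack("<I", 4) + b"abcd"
                                                                  bad = b"DATA" + struct.pack("<I", 0xFFFFFFFF)  # lies about its payload length
                                                                  print(list(read_chunks(good)))
                                                                  try:
                                                                      list(read_chunks(bad))
                                                                  except ValueError as e:
                                                                      print("rejected:", e)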

                                                            1. 18

                                                              Being a self-taught programmer, I think the study of file formats is underrated. I only learned file format parsing to help me write post-exploitation malware that targets the ELF file format. I also used to hack on ClamAV for a job, and there I learned better how to parse arbitrary file formats in a defensive way–such that malware cannot target the AV itself.

                                                              I’m in the process of writing a proprietary file format right this very moment for ${DAYJOB}. The prior version of the file format was incredibly poorly designed, rigid, and impossible to extend in the future. I’m grateful for the lessons ELF and ClamAV taught me, otherwise I’d likely end up making the same mistakes.

                                                              1. 15

                                                                There’s a field of IT security called “langsec” that’s basically trying to tell people how to design file formats that are easier to write secure parsers for. But it’s not widely known and as far as I can tell usually not considered when designing new formats.

                                                                I think this talk gives a good introduction: https://www.youtube.com/watch?v=3kEfedtQVOY

                                                                1. 10

                                                                  The laziest answer is don’t bother and let Sqlite be your on-disk file format. Then you also get interop with any geek wanting to mess about with your data, basically free.

                                                                  It’s certainly not ideal in some situations, but it’s probably a good sane default for most situations.

                                                                  sqlite links about it: https://sqlite.org/affcase1.html and https://sqlite.org/fasterthanfs.html

                                                                  That said, I agree it would be great to have nice docs about various tradeoffs in designing file formats. So far the best we seem to have are gotcha posts like this one.

                                                                  1. 3

                                                                      Or CBOR, flatbuffers/capnproto/etc, just any existing solid serialization format, if you are storing just “regular” data. Things like multimedia come with special requirements that might make reusing these formats difficult.

                                                                    1. 2

                                                                      There are three use cases that make designing a file format difficult:

                                                                      • Save on one platform / architecture, load on another (portability).
                                                                      • Save on one version of your program, load on a newer one (backwards compatibility).
                                                                      • Save on one version of your program, load and modify on an older one (forwards compatibility).

                                                                        Of these, SQLite completely fixes the portability problem by defining platform- and architecture-agnostic data types. It transforms the other two from file format design problems into schema design problems. Backwards compatibility is fairly simple to preserve in both cases: read the file / database and write out the new version. It may be slightly easier to provide a schema migration query in SQLite than to maintain the old reader and the new writer for a custom file format, but you’re also likely to end up with a more complex schema for a SQLite-based format than something custom. It can help a bit with forwards compatibility. This is normally implemented in custom formats by storing the version of the creator and requiring unknown record types to be preserved, so that a new version of the program can detect a file that contains records saved by a program that didn’t know what they meant and fix up any changes. It may be possible for foreign key constraints and similar in SQLite to avoid some of this but it remains a non-trivial problem.
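
                                                                        One concrete hook SQLite gives you for the versioning part is the user_version pragma; a rough sketch (the schema and version numbers are made up for illustration):

                                                                            import sqlite3

                                                                            CURRENT_VERSION = 2

                                                                            def open_document(path):
                                                                                db = sqlite3.connect(path)
                                                                                version = db.execute("PRAGMA user_version").fetchone()[0]
                                                                                if version == 0:
                                                                                    # fresh file: create the newest (toy) schema directly
                                                                                    db.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT, created_at TEXT)")
                                                                                elif version == 1:
                                                                                    # backwards compatibility: migrate a file written by an older release
                                                                                    db.execute("ALTER TABLE notes ADD COLUMN created_at TEXT")
                                                                                elif version > CURRENT_VERSION:
                                                                                    # forwards compatibility: refuse files written by a newer release
                                                                                    db.close()
                                                                                    raise RuntimeError("file was written by a newer version of this program")
                                                                                db.execute(f"PRAGMA user_version = {CURRENT_VERSION}")
                                                                                db.commit()
                                                                                return db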

                                                                    2. 10

                                                                      Excellent point — same thing goes for network protocols, though they’re less common.

                                                                      I learned a lot from RFC 3117, “On The Design Of Application Protocols” when I read it about 20 years ago. It’s part of the specs for an obsolete protocol called BEEP, but it’s pretty high level and goes into depth on topics like how to frame variable length records, which is relevant to file formats as well. Overall it’s one of the best-written RFCs I’ve seen, and I highly recommend it.

                                                                      1. 5

                                                                        IFF, the inspiration for RIFF, was used everywhere on the Amiga, more or less.

                                                                        1. 2

                                                                          Its ubiquity also had the advantage that you could open iffparse.library to walk through any IFF-based format instead of writing your own (buggy) format parser.

                                                                        2. 4

                                                                          I had the same question when I learned about the structure of ASN.1, which is probably only used to store cryptographic data in certificates (maybe there are some other uses, but I haven’t seen any), but probably can be used anywhere really (it’s also a TLV structure).

                                                                          1. 7

                                                                            ASN.1 is used heavily in telecommunications and is used for SNMP and LDAP. Having implemented more of SNMP than I care to remember and worked with some low level telecoms stuff, ASN.1 gives me nightmares. I know the reasons for it but it’s definitely more complicated than it seems…

                                                                          2. 3

                                                                            I don’t think “how to design a file format” is often taught but I’ve been taught many examples of file and packet formats with critiques about what parts were good or bad.

                                                                            RIFF itself may not be common but its ideas are; PNG most notably. Also BEEP/BXXP, a now dead 2000-era packet format. But these days human readable delimited formats like JSON and XML are more in fashion.

                                                                            The reality is no product succeeds or fails on the quality of its data formats. Their fate is determined by other forces and then whatever formats they use are what we are stuck with.

                                                                          1. 3

                                                                            I wouldn’t say public CDNs are completely obsolete. What this article does not take into consideration is the positive impact of geographic locality (i.e. reduced RTT and packet loss probability) on transport layer performance. If you want to avoid page load times on the order of seconds (e.g. several MB worth of javascript over a transatlantic connection) either rely on a public CDN or run your own content delivery on EC2 et al. Of course this involves more work and potentially money.

                                                                            1. 2

                                                                              This would only apply if whatever you’re fetching from the CDN is really huge. For any reasonably small file the transport performance is irrelevant compared to the extra handshake overhead.

                                                                              1. 1

                                                                                  It does apply for smallish file sizes (on the order of a few megabytes). It mainly depends on how far you have progressed the congestion window of the connection. Even with an initial window of 10 MSS it would take several RTTs to transfer the first megabyte.
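
                                                                                  A rough back-of-the-envelope check of that claim (idealized slow start, ignoring delayed ACKs, losses and pacing; the MSS value is an assumption):

                                                                                      MSS = 1460          # a typical MSS in bytes (assumption)
                                                                                      IW = 10             # initial congestion window, in segments
                                                                                      target = 1_000_000  # roughly the first megabyte

                                                                                      sent, cwnd, rtts = 0, IW, 0
                                                                                      while sent < target:
                                                                                          sent += cwnd * MSS  # send a full window, then wait one RTT for the ACKs
                                                                                          cwnd *= 2           # classic slow start doubles the window every RTT
                                                                                          rtts += 1
                                                                                      print(rtts, "round trips to deliver the first ~1 MB")  # prints 7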

                                                                                1. 3

                                                                                  There’s a benefit if you use a single CDN for everything, but if you add a CDN only for some URLs, it’s most likely to be net negative.

                                                                                  Even though CDNs have low latency, connecting to a CDN in addition to the other host only adds more latency, never decreases it.

                                                                                  It’s unlikely to help with download speed either. When you host your main site off-CDN, then users will pay the cost of TCP slow start anyway. Subsequent requests will have an already warmed-up connection to use, and just going with it is likely to be faster than setting up a brand new connection and suffering TCP slow start all over again from a CDN.

                                                                                  1. 1

                                                                                      That is definitely interesting. I never realized how expensive TLS handshakes really are. I’ve always assumed that the number of RTTs required for the crypto handshake was the issue, not the computational part.

                                                                                      I wonder if this is going to change with QUIC’s ability to perform 0-RTT connection setups.

                                                                                    1. 1

                                                                                      No, CPU cost of TLS is not that big. For clients the cost is mainly in roundtrips for DNS, TCP/IP handshake and TLS handshake, and then TCP starting with a small window size.

                                                                                      Secondary problem is that HTTP/2 prioritization works only within a single connection, so when you mix 3rd party domains you don’t have much control over which resources are going to load first.

                                                                                        QUIC 0-RTT may indeed help, reducing the additional cost to just an extra DNS lookup. It won’t solve the prioritization problem.

                                                                            1. 1

                                                                              http://de interestingly ends up on some advertising page. How does one take over such a page?

                                                                              1. 3

                                                                                It doesn’t resolve for me. Maybe you have some catchall DNS?

                                                                                1. 1

                                                                                    Have you tried to add a dot?

                                                                                  1. 5

                                                                                    There is no A record for de. You are behind a split-horizon resolver, most likely your ISP’s to inject advertisements.

                                                                                    1. 2

                                                                                      Creepy. That would explain why I got different results using my cell network.

                                                                                2. 1

                                                                                  no luck for me either. I wonder if there’s something funky with my DNS. dig A de spits out

                                                                                  de.			7155	IN	SOA	f.nic.de. its.denic.de. 1623166346 7200 7200 3600000 7200
                                                                                  
                                                                                  1. 3

                                                                                    That’s just the authority record (see the SOA instead of A) telling you which nameserver is authoritative for the query. There aren’t any A records listed on de as far as I can see.

                                                                                1. 2

                                                                                  Couple of notes:

                                                                                    If this is ongoing and not broadly patched yet, is it responsible to reveal this much detail about the vulnerabilities? (Cynical take: IoT devices are generally insecure as hell anyway; the knowledge that these vulnerabilities exist doesn’t make much difference to an attacker.)

                                                                                  Why the hell doesn’t calloc perform an overflow check? It literally has two jobs…

                                                                                  (For that matter, why aren’t they showing calloc’s original source code?)

                                                                                  1. 2

                                                                                    IoT devices are never broadly patched. That’s… part of the business model.

                                                                                    There’s little a responsible security researcher can do here.

                                                                                    1. 1

                                                                                      “Never” is an exaggeration. I own multiple IoT devices that receive firmware updates, such as WeMo smart outlets and Ecobee thermostats. (And my Eero router, if you count routers as IoT.)

                                                                                      I’m aware of one RTOS vendor (forgotten the name) whose top selling point is its superior support for secure OTA firmware updates.

                                                                                    2. 1

                                                                                        Haven’t fully read the article, so maybe I have missed some information.

                                                                                        As far as I see there are patches available for the affected vendors and some of them already have patches for their devices. At this point there is no reason to hide the details of the vulnerabilities. There are tools for analyzing binary patches and finding the bugs, so it’s quite easy to understand the bugs and create exploits for other devices.

                                                                                        On the defending side there is usually little information about the security issues. This brings problems for planning and executing an update (if available). Disabling every affected device till it’s updated might be a good idea from a security perspective, but your management will not like it. So you have to do risk management. Therefore, more information about the bugs is better.

                                                                                    1. 3

                                                                                      I think you’re looking for Signed HTTP Exchanges: https://developers.google.com/web/updates/2018/11/signed-exchanges

                                                                                        But it’s controversial; Mozilla had some reservations. I haven’t dug into the details of that discussion though.

                                                                                      1. 4

                                                                                          Signed HTTP Exchanges (SHE) is much, much more. My understanding is that this is an intent to bake something like AMP into web standards. What I find most worrisome here is that it allows one origin to act on behalf of another origin without the browser ever checking the actual source. Essentially, this means amp.foo.example could - for all intents and purposes of web security and the same origin policy - speak for my-totally-other-site.example. This also removes the confidentiality you could have with the origin server and inserts a middle-man, which you wouldn’t have if you talked to the origin server directly. Mozilla openly considers Signed HTTP Exchanges harmful.

                                                                                        That being said, a solution for bundling that supports integrity and versioned assets would be very much welcomed though!

                                                                                      1. 3

                                                                                        Attribution is trivial and left as an exercise to the reader.

                                                                                        Now I’m curious. Can you share the binary if you’re leaving that as an exercise to the reader?

                                                                                        1. 1

                                                                                          I think that was a joke. Attribution is usually the exact opposite of trivial.

                                                                                          1. 6

                                                                                            Apparently no. Over at the orange site: https://news.ycombinator.com/item?id=26302565

                                                                                            tl;dr it looks pretty much like an exploit from Immunity.

                                                                                            1. 1

                                                                                              Yeah, I thought maybe the author was saying that someone was claiming credit if you looked at the strings, and for some reason didn’t want to call attention to whom.

                                                                                              It works as a joke, too.