1. 9

Abstract: “X.509 certificate parsing and validation is a critical task which has shown consistent lack of effectiveness, with practical attacks being reported with a steady rate during the last 10 years. In this work we analyze the X.509 standard and provide a grammar description of it amenable to the automated generation of a parser with strong termination guarantees, providing unambiguous input parsing. We report the results of analyzing a 11M X.509 certificate dump of the HTTPS servers running on the entire IPv4 space, showing that 21.5% of the certificates in use are syntactically invalid. We compare the results of our parsing against 7 widely used TLS libraries showing that 631k to 1,156k syntactically incorrect certificates are deemed valid by them (5.7%–10.5%), including instances with security critical mis-parsings. We prove the criticality of such mis-parsing exploiting one of the syntactic flaws found in existing certificates to perform an impersonation attack.”

  2. 2

    That sounds nice! Though with 21.5% of the certificates surveyed in the wild being invalid, I wonder if it would be more interesting to propose a grammar description that is more lenient than the standard. I’ll read up on the finer details to see whether they were able to classify the violations. (But at 32 pages, I’ll likely only finish reading it when this thread has long since left the front page :))

    Also, it’s a bit unfortunate that they sourced certificates by talking to the public IPv4 space. I suppose they’ll miss quite a few certificates due to CDNs and virtual hosting. It seems likely that you’d get a more realistic set by looking at Certificate Transparency logs. The upside would be that you’d skew towards actual in-use production certs and avoid all sorts of oddities (e.g., self-signed certs). Building on top of this would also be more useful for informing suggestions for better practices and standard improvements.

    1. 2

      Sounds like some great ideas! :)

      1. 1

        I don’t know if you have seen this yet, but did they also mention which ports they hit in the public space to get the certificates? I wonder how many open ports >1024 are out there with valid, publicly signed certs. (I know of a few myself for sure, running on ports like 8080 or 4567.)