As a follow-up to the https://lobste.rs/s/0c0tvx/what_email_client_do_you_use story, I’m curious what people’s strategies are for archiving and accessing their mail archives. I’m increasingly concerned about the idea of leaving many gigabytes of really personal email on some server owned by some company in the cloud. What approaches / tooling are out there to help reduce this threat surface? Is it all just local mail + offline encrypted backup, or is there something better?
That’s basically what I do… I use mbsync to download it, and notmuch to tag things as archive. My local maildir is backed up / encrypted periodically. I’m not sure what else would be possible… you either do that, or do basically the same thing but on someone else’s computer (the “cloud”).
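For anyone wanting to replicate that setup, a minimal sketch of the sync half (host, user, and channel names here are placeholders, not details from the comment):

```
# ~/.mbsyncrc: mirror an IMAP account into a local maildir (isync 1.4+ syntax)
IMAPAccount personal
Host imap.example.com
User me@example.com
PassCmd "pass show mail/personal"
TLSType IMAPS

IMAPStore personal-remote
Account personal

MaildirStore personal-local
Path ~/Maildir/
Inbox ~/Maildir/INBOX

Channel personal
Far :personal-remote:
Near :personal-local:
Patterns *
Create Near
SyncState *
```

After running "mbsync personal", something like "notmuch new" followed by "notmuch tag +archive -- folder:Archive" indexes the maildir and tags the archive folder, roughly matching the workflow described above.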
I don’t really archive anymore. It’s all just text and so it’s not so much, even after 15 years. As I self-host, it’s just a Maildir with old and new mails. Backing that up with deduplication is painless.
The worry that I have is not so much the cost of the storage, it’s the fact that a compromise in any mail client gives access to all of that mail. I’ve not seen any support in mail servers for spotting unusual traffic patterns and requiring 2FA to reactivate a per-client key, for example.
Fair point, but it’s a tradeoff I am willing to make.
This is part of my concern too.
I kind of blend the two: 1) I aggressively delete emails, and 2) archive only what I feel would be a real loss. For example, I don’t save any emails that are just manifestations of some service’s history, such as a receipt for Apple Card payments, or other statement emails. With this approach I end up with fewer than 100 emails a year, but I do have to trust that these external services will keep the data easily available.
I actually delete mail a lot rather than keeping everything around.
Yeah, this has been my strategy for years. I just delete the majority of my emails, and I’ve never had issues. There was a period when I diligently archived emails, but I now realize I never went back to read them.
My email is hosted with Migadu. I sync emails down to my NAS periodically with offlineimap, after which they’re backed up to Backblaze via Duplicity.
I like this approach because I can swap out my domain name provider, email host, backup provider, with minimal impact upon any of the other services.
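A sketch of how those two steps could be chained from cron (schedule, paths, GPG key ID, and B2 credentials are all placeholders for illustration):

```
# crontab: sync mail down at 02:30, push an encrypted backup to B2 at 03:00
30 2 * * * offlineimap -o -u quiet
0 3 * * * duplicity --encrypt-key 0xDEADBEEF /home/me/Maildir b2://ACCOUNT_ID:APP_KEY@mail-backup-bucket
```

Because each step only sees a generic maildir on disk, swapping the mail host or the backup target really does come down to editing one line.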
I host my own on a dedicated server running OpenBSD. A nightly job encrypts and sends to Backblaze’s B2. I test restores quarterly.
I run my own mail server and use IMAP, so the backup system backing up my server also backs up my email. It’s worked fine for the past fifteen years.
If you don’t trust your email provider, I don’t think backing up and then deleting is a good approach, as you have no clue what the provider actually does on deletion. And the failure modes can be pretty basic: maybe they keep backups, maybe those backups are faulty, maybe they get hacked, maybe some rogue employee does something not so good.
Without wanting to get too off-topic: with a lot of “secure” email providers (and cloud infrastructure at large), people base their choice on marketing claims when it comes to security. I think it makes sense to reduce that risk.
Now I am very aware that a lot of people don’t want to run their own email server, but then something like E2E encryption would make sense. I wonder if a good approach, short of running a full email provider, could be to have some basic image that just encrypts incoming emails as they arrive, so that storage and sending could still be handled by the provider.
Note that addressing this is the goal of Confidential Cloud Computing. The hardware removes the hypervisor from the TCB for confidentiality and integrity (but not for availability). Any page assigned to the VM is encrypted and its contents cannot be accessed by the host. The initial boot memory contents are attested by the hardware (or by some attested software), so you can check that you’re booting the kernel you expect before you provide it with the decryption keys for the disk images it has attached. If you use something like dm-verity, then you can also validate that the cloud provider is not tampering with the disk contents (unfortunately, there isn’t yet a read-write dm-verity, so you have to fall back to something like dm-integrity, which is vulnerable to replay attacks).
An email provider building an offering on these technologies would be able to give you verifiable guarantees that they couldn’t see the contents of your email (they could probably leak some of it via side channels, but that would require a targeted attack, in contrast to the situation today where a malicious actor in the company could just copy all of everyone’s email). Endpoint security becomes very important in that case because the easiest way of stealing all of your email is to compromise one of your client devices.
You’re exactly right, of course, but I always chuckle a little bit when I read something like this. It seems to say “Now, in order to be secure, all we need to do perfectly is the exact thing that we’re the very worst at.”
It’s certainly a worthy improvement over the current cloud situation, to be clear. But I think there’s an argument to be made that it is easier for an organization to run its own server and have dumber clients, and I wouldn’t be terribly surprised to see experiments in this direction.
Sad, but true.
Even with a dumber client, it’s not clear you get an advantage. If the dumb client can list emails and download them (or even view them), then a compromise of the client can do the same.
This is what makes me incredibly nervous about webmail systems. When I open an HTML email in Apple’s Mail.app, it uses WebKit2’s sandboxing support to decode the HTML and generate the display in a separate, unprivileged, process. If there’s a vulnerability in WebKit then a random person who sent me an email can compromise an unprivileged process and hopefully not get any further. They need to also find a kernel vulnerability that they can exploit to be able to make a network request, for example. In contrast, if I read an email in a WebMail client in Safari on the same system, the entire web app runs in the same renderer process. A malicious email that exploits a vulnerability in the same WebKit that Mail.app uses can now do anything that I can do in that web app, including downloading all my mail to the browser to scan for sensitive messages, forward mail to other people, and so on.
I wonder if there’s an easy way to configure Mail.app to run against an offline store (mbox / maildir). Perhaps the right way is to run a local IMAP server over the offline store?
Dovecot can work as this kind of proxy quite easily - you can configure it to use a local mail store and not accept delivery and also to use a remote IMAP server if you want to be able to collect mail from there.
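A rough sketch of what the loopback-only half of that could look like (settings and paths are illustrative, not a tested config; recent Dovecot versions have renamed some of these options):

```
# dovecot.conf: serve an existing local maildir over IMAP on localhost only
protocols = imap
listen = 127.0.0.1
mail_location = maildir:~/Maildir
# plaintext auth is acceptable here because it never leaves the loopback
disable_plaintext_auth = no
ssl = no
```

Mail.app can then be pointed at 127.0.0.1 as an ordinary IMAP account, with the offline maildir as its backing store.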
I think this might be where I eventually land. Cloud based mail server, synced locally, encryption at rest, encrypted cloud backup. The big part of this for me is the part where I don’t want a bunch of emails sitting on a box in the cloud for a long period of time in an unencrypted form.
Yeah, the ingress insecurity is a problem, but then don’t most email servers use TLS for in-transit security?
I run an imap server on my home server and use mail.app to drag messages from my Archive folder to an Archive folder on that home server. The home server is then backed up to cloud storage using restic.
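That kind of pipeline might look something like this in cron (the repository URL and retention policy are made up for illustration):

```
# crontab: nightly restic snapshot of the archive, weekly pruning
0 4 * * * restic -r b2:mail-backup-bucket:/ backup /home/me/MailArchive --quiet
0 5 * * 0 restic -r b2:mail-backup-bucket:/ forget --keep-weekly 8 --keep-monthly 12 --prune --quiet
```

restic encrypts everything client-side, so the cloud storage only ever sees ciphertext.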
I just do a Takeout every 6–12 months, then convert it all to maildir, then use notmuch and netviel to view any old emails.
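For the “convert it all to maildir” step, a minimal sketch using only Python’s standard-library mailbox module (the function name and paths are mine, not an existing tool):

```python
import mailbox

def mbox_to_maildir(mbox_path: str, maildir_path: str) -> int:
    """Copy every message from an mbox file into a Maildir; returns the count."""
    src = mailbox.mbox(mbox_path)
    dst = mailbox.Maildir(maildir_path)  # creates cur/new/tmp if missing
    count = 0
    try:
        for msg in src:
            # MaildirMessage translates mbox status headers into maildir flags
            dst.add(mailbox.MaildirMessage(msg))
            count += 1
    finally:
        dst.close()
        src.close()
    return count
```

Google Takeout delivers Gmail as one large mbox file, so pointing this at it produces a maildir that notmuch can index directly.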
I could also use https://github.com/jay0lee/got-your-back, which does backups and restores of Gmail. I used it to migrate from a Google Apps account to a free Gmail account no problem. Works super well.