- What are the best practices in internal-service communication?
- Should all services be talking to each other through an application-level encrypted channel?
- Should they be running in a secure network instead and be able to trust anything in that network?
- Is there another option?
- What are the pros and cons of the various solutions?
- What are people doing in the industry?
3. Don’t assume the network is safe. Don’t assume that the servers are who they say they are.
2, after reading 3: Yeah, I’m not sure what you mean by that, but I think this is safer.
6. In my case it’s a strange mixture of “Let’s build it safe and proper” and “I’m out of time because the stakeholder says so.”
You can do encryption at the OS layer, with something like ipsec. Or at the application layer with HTTPS. Is there a reason to do it at the application layer?
How do you actually do this in a safe way? For HTTPS you need certificates, what are people using to manage certificates in a way that is safe and effective?
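One piece of the “safe and effective” puzzle is making sure services only trust your own internal CA and always verify the peer. A minimal sketch using Python’s stdlib `ssl` module (the function name and the idea of a dedicated internal CA file are illustrative, not a prescribed setup):

```python
import ssl

def make_internal_tls_context(ca_file=None):
    """Client-side TLS context for internal services: verify the peer's
    certificate and its hostname, and refuse old protocol versions."""
    ctx = ssl.create_default_context(purpose=ssl.Purpose.SERVER_AUTH)
    if ca_file:
        # Trust only the internal CA, not the whole system trust store.
        ctx.load_verify_locations(cafile=ca_file)
    ctx.verify_mode = ssl.CERT_REQUIRED   # reject unauthenticated peers
    ctx.check_hostname = True             # bind certificates to service names
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

The hard operational part this sketch skips is exactly the question above: issuing, rotating, and revoking the certificates that such a context verifies.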
Also, one does not have control over all protocols. For example, the protobufs interface to Riak is just a raw socket. What should one do in cases like this, if the best practice is application-level encryption of the communication channels?
3 - Unless Riak has some option to do that, I’d slap a tiny layer in front of it to sign or encrypt requests. On a bad signature/encryption, fail; on a good signature, pass the thing on to Riak. I can’t promise this is a good idea, but it’s probably better than potentially exposing the socket to unknown clients. It should also go without saying that if you go down that route, you should definitely not roll your own crypto.
I’ll stress that this is only me, but it seems like a decent enough solution. You could probably pull off something with SSL certificates too.
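The “tiny signing layer” idea above can be sketched with stdlib HMAC, which avoids rolling your own crypto. This is a toy illustration, not a vetted protocol: the key constant is a placeholder (distributing it is the real problem), and this authenticates requests but does not encrypt them or prevent replays.

```python
import hmac
import hashlib

# Placeholder key: in reality this must be distributed out of band.
SHARED_KEY = b"distribute-this-out-of-band"

def sign(payload: bytes) -> bytes:
    """Prepend an HMAC-SHA256 tag (32 bytes) to a raw request payload."""
    tag = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
    return tag + payload

def verify(message: bytes) -> bytes:
    """Check the tag; return the payload to forward (e.g. to Riak), or raise."""
    tag, payload = message[:32], message[32:]
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("bad signature: dropping request")
    return payload
```

The proxy would call `verify()` on every inbound message and only forward the payload to the backend socket on success, dropping anything that fails.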
But why can’t I just do this at the OS layer with something like IPSec? What is the value added by doing it at the application-layer?
HTTPS is a poor solution to this problem at scale; I’ve seen it go wrong in enough ways to be wary of it.
I’ve basically given up and said that if the data should be encrypted, I’m kicking it up to the OS where I have more ability to monitor and control access.
That being said, I’ve done this well on hardware and networks that “I” (the company) owns. If this was a VM running in an AWS data center somewhere, I might have more heartburn. But at that point if I convince myself that I can’t trust the hardware, the OS, or the copper… I don’t see any reason to think that application level authentication should be trusted either.
We require encryption on the wire for almost everything. To make this somewhat easier to handle and support, most of our services are HTTPS. We also make heavy use of HTTP Basic auth, for authentication purposes.
You can’t secure the network, and often these days it’s not even your network. Application-level authentication is the way to go, because then you just don’t care about the transport layer. At some stage I’d like to write an SOA that used OAuth for everything even internally, but for now in industry I’ve used simpler approaches (API accounts and HTTPS, basically).
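The “API accounts” approach mentioned above boils down to checking a per-service credential on every request. A minimal sketch, assuming a hypothetical in-memory account store (a real deployment would keep hashed credentials in a database or secret manager):

```python
import hmac

# Hypothetical account store; names and tokens are illustrative only.
API_ACCOUNTS = {"reporting-service": "s3cr3t-token"}

def authenticate(account: str, token: str) -> bool:
    """Check an API account's token, e.g. taken from an HTTP Basic auth header."""
    expected = API_ACCOUNTS.get(account)
    if expected is None:
        return False
    # compare_digest avoids leaking the correct token via timing differences.
    return hmac.compare_digest(expected, token)
```

Because the token travels with every request, this only makes sense inside a TLS channel; over plaintext it degenerates to the “trust the network” model.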
@olivier also gave similar advice: “Don’t assume the network is safe. Don’t assume that the servers are who they say they are.”
I’d like to know what both of you think of the virtual network features offered by hosting platforms like Amazon Web Services (Virtual Private Cloud) or Google Compute Engine. They give you a virtual network with its own address space, where only your hosts can communicate with each other. Do you think it’s not secure enough? Why?
Intuitively, I would tend to agree with both of you and place more trust in an end-to-end encryption than in some intermediary layer which can be subject to configuration mistakes or even bugs in the networking software.
My experience using Amazon VPC was a frustrating one, where features that should have been supported weren’t, or interacted badly. It may have gotten better by now. I suspect it’s at least as secure as one’s own setup when it’s working.
Is there any kind of open standard for these things? If you use standard, non-Amazon/Google-specific tools like Fabric and Puppet, it’s quite possible to more or less decouple your infrastructure from any specific provider, which is a very nice position to be in: if you want to move a few services onto dedicated hosting, you can do that without worrying about the security implications, because you were never trusting the network in the first place. I worry that integrating VPC would lock you into Amazon.
I don’t know if I’m just being ridiculously paranoid, but I find that with things like SSL you can at least guarantee machine-level authenticity (to some degree, assuming SSL has no more flaws than the alternatives), whereas VPC and other “I promise you this network is secure” solutions assume the traffic is safe if you are inside the network.
Put simply, that means an attacker only needs to compromise access to the network to gain access to its components’ services, whereas otherwise they’d need to compromise a machine’s certificates to assume that machine’s identity towards the other machines that usually accept traffic from it.
Attacking AWS VPC means attacking Amazon’s host machines that talk to each other using ipsec. I think if your threat model includes actors that could viably perform such an attack – hey, that actor has a three letter acronym too! – you’ve got to discount using any cloud service from the get go.
Put another way, if the actor in your threat model can break into the VPC hosts, why can’t they break into the hosts that are hosting your SSL-enabled services?
So no, I don’t think you’re ridiculously paranoid at all – you’re just not paranoid enough! ;-)
Totally agree. I don’t assume in my post that one is more secure than the other; I say that if both of these things can indeed happen, it feels (to me) more secure not to trust the network from the get-go. Then again, there’s no good reason to trust any external machine, since it can be compromised without us knowing. Heck, why should we even trust the host machine? I’m willing to assume from now on that I’m forever screwed, and to measure security in terms of risk rather than in terms of trust.
Yes, but if a machine is compromised, the attacker can keep the existing TLS certificates and thus continue to communicate with your other machines. I understand what TLS brings in terms of security if the network is compromised, but I don’t understand what it brings if a host is compromised.
Likewise, it could be hard to prove that the network you join isn’t compromised. Also, for me it’s easier to invalidate a certificate than to rebuild a network or make it secure again. Which is why I’m talking about assuming that everything is compromised and mitigating risk instead.
You’ve got two services, foo.example.com and bar.example.com. The attacker owns the two AWS hosts that run your services. They’re reading the RAM of both of your guest VMs. They’re forwarding all in/outbound traffic from foo:em0 to their own guest VM, which decrypts the packets with the keys they read right out of your domU.
What do you see? Normal traffic from foo:em0 <-> bar:em0. Why would you think you’d need to invalidate the certificate?
You’re smart and are using an SSL mode that supports PFS. Great, but they own dom0 and are passing the session keys along as well. Hell, maybe they’re a real advanced threat, take it one step further, and are just slurping stuff out of RAM post-decryption. No network foolin' required!
Contrast that with AWS VPC, where a packet from 10.0.0.2 -> 10.0.0.3 actually goes DomU -> Dom0 -> Dom0 (assuming your DomUs aren’t luckily on the same box) -> DomU. I believe VPC uses ipsec on the dom0s to join the network, in which case getting at the packet still means you have to attack dom0. And that’s just as game over. AWS is a bit cagey about the specifics of their approach, though, so it’s possible that’s not the case.
I think the fundamental point I’m making here is that any AWS attack that would break into a VPC would mean breaking into the host that runs your VM, at which point your battleship has been sunk.
This isn’t to say you shouldn’t encrypt things - getting into the habit of encrypting everything is a Very Good Idea because it almost is always the right decision. It’s just that I think you’re drawing a distinction between two attacks that are fundamentally identical.
Nice analysis! Thanks!
I don’t mean “for me” as in “in my opinion”, though; I mean it as in “for my personal skillset”. Is the concern the same with some other OS-level VLAN? It may just be a displacement of the issue; I have never used AWS VPC and I’m not familiar with it, which is why I said “a network”. Given what you say about VPC, I agree. Assuming everything is compromised doesn’t mean dropping all protections; it’s a question of trust vs. risk, in my mind: measuring impact and, ideally, mitigating it as much as you can. idk.
That’s a really good point. If the attacker is able to hack a virtual network like VPC (for example by rerouting traffic to a machine they control), then they are probably also able to break into your hosts, and in such a case using TLS or not does not change anything.
The more I think about it, the less I see the purpose of encrypting traffic inside the data center, if I control the hardware, the network, and the people in that data center. If an attacker gets access to a switch or a router in my data center in order to spy on my communications, then he is probably also able to get access to my hosts, or would that be harder? Am I missing something?
It looks like @stig has answered my question in another comment: it is easier for an attacker to listen to my traffic without my knowing by attacking the network than by attacking the host. If the traffic is unencrypted, getting access to a switch or a router is enough to listen in. If the traffic is encrypted, you have to attack the host itself, and that’s harder to do without being “seen”.
Edit: I’m not so sure about this. With Xen/KVM virtual hosting, if the attacker controls the Xen/KVM host, I guess he could access a lot of the information stored in RAM and on disk without the Xen/KVM guest knowing it.
AWS has a good whitepaper on their security that might be illuminating. They reference “packet sniffing protection” for their network but I’m not sure what exactly that means.
The other half of the equation that most people don’t think about: if you aren’t going to trust AWS, who will you trust? Yourself? Are you sure you/your company has the experience and processes in place to sufficiently harden internal servers as well as, if not better than, Amazon?
Me, I drop ipsec on my EC2 instances and call it a day. And that’s on instances that aren’t sensitive enough to really require encryption. Just habit, I guess.
I know this paper :) The “packet sniffing protection” means that Xen instances are unable to “sniff” traffic intended to be received by other hosts on the same network. This is totally standard. Even providers like DigitalOcean or Linode, which do not support private virtual networks, do it. The paper does not say anything about the kind of attack we are talking about in this thread.
I fully agree with your paragraph about trusting AWS versus trusting myself. It’s probably reasonable to trust them :)
About ipsec, do you mean that you enable ipsec on all your EC2 instances and channel all your communications through it?
I kinda figured that was their approach. I’m a bit bummed they’re not more explicit about how they form their VPCs. My understanding was that they’re pooling a bunch of disparate hosts under a VPN but that was from a while back and my memory is foggy…
Pretty much, yeah. I haven’t used EC2 in a while so things have probably changed, but I remember there were some headaches around NAT traversal to get around how AWS handled ipsec traffic and dynamic IPs. It was really helpful when I had a handful of instances with services that didn’t play well with HTTPS, and I had a local ipsec gateway that I wanted to get connected at the same time.
They probably “overlay” the virtual network over the real network. Here is a very interesting talk that partly explains how they do it at Google: https://www.youtube.com/watch?v=n4gOZrUwWmc
How do you handle the secrets in a secure way though?
No clue. You mean, more than using ssh and trying not to pass them over the net in cleartext? No clue.
This is why I don’t quite buy the advice of “yes, encrypt everywhere, especially if you want machine authentication”. One has to distribute the shared secrets somehow, and I haven’t heard anyone address that in a way that doesn’t invalidate the premise of using encryption for identity.
You are right in saying that the APIs used to setup the network in AWS or GCE are specific to each provider. But in my experience the lock-in is really minimal if your network configuration is relatively simple and flat.
Some providers, like DigitalOcean or Linode, do not offer a real private network. I agree with you that it’s easier to move to such a provider if you don’t rely on the network being private.
According to the defense in depth principle, the best solution is probably to combine private networking (like Amazon VPC) and end-to-end encryption.
It depends on if you trust AWS :-)
You can peer two VPCs, which lets you route traffic from one private subnet to another. Thus I suspect that if AWS want to (or are forced to) they can at least listen in on your network traffic without you knowing.
Personally I’m not that paranoid, but my project is more about how quickly we can get retail product data in front of as many eyeballs as possible…
I agree that if AWS want to, or are forced to, they can listen to my network traffic without me knowing. But even if I encrypt my traffic using TLS, I think they would still be able to spy on my Xen instance by snapshotting the RAM and/or intercepting IO operations to my EBS disks.
If the “attacker” is AWS, then I’m not sure that encrypting my traffic will really bring any added security. It just makes things a little bit more difficult, but not that much. For example, they can snapshot my EBS disk, retrieve my TLS certificates, and use them to decrypt my traffic. :)
Your question reminds me of Google’s response to Snowden’s revelations. Before Snowden’s revelations, Google was not encrypting communications between its data centers because they were using their own private network and their own fiber, and they supposed this network could be trusted. They didn’t think the NSA could be tapping the fiber itself… It looks like Google is now encrypting communications even when they are between their own data centers on their own network. Maybe this is another example in favor of an end-to-end encryption.
I do not think that NFS or Red Hat Cluster Suite encrypts anything. Even if networking has converged on IP, that does not mean you cannot treat it like PCIe when you implement local communication.
Designing a web service or something similar? Encrypt.