BackBlaze acknowledged this and pushed out a fix. Facebook’s SDKs are notorious for recording far more data than necessary, as noted here, so I don’t feel BackBlaze was shipping off data intentionally; I think they were blindsided by Facebook changing things under them.
BackBlaze is responsible for the code on their website. If they ship code in their web app which ships all the names of the user’s files to Facebook, that’s on them. This is a huge violation of trust from BackBlaze. “A library did it” isn’t an excuse.
I completely agree, it is certainly a grave mistake on their part. What I meant was that this incident appears to be a result of carelessness rather than malice.
Ah, makes sense. That is indeed an important thing to point out.
Case of “Never attribute to malice that which can be adequately explained by stupidity.”?
Never attribute to malice that which can be adequately explained by passing the buck to a library♥
Absolutely, I mean what did they expect would happen when they included some tracking garbage from Facebook? I evaluated them and eventually planned to use them as a block storage provider, but I canceled my account with them today after I read about the tracking pixel. There’s absolutely zero reason for including this tracking stuff in the admin part of the website.
The only mitigation I can think of is to code-review (at some level) all diffs of all dependencies (transitively), when any first-level dependency changes.
It’s even worse if some libraries are loaded from a third party, which could change them at any time.
I think that is a lot of difficult, challenging work.
Is there a better idea than the one above? Or is that just the cost of doing business, and the best approach would be for us to somehow distribute the load (e.g. a 3rd party, curated, checked, trusted JS stack which covers a common set of modules)?
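To make the review idea above a bit more concrete, here’s a minimal Python sketch of the first step only: working out which resolved packages changed between two lockfile snapshots, so you know which diffs need reading. It assumes the lockfile has already been parsed into a plain name-to-version mapping; the function name and the package names are made up for illustration.

```python
def changed_dependencies(old, new):
    """Return {name: (old_version, new_version)} for every package that was
    added, removed, or bumped between two resolved dependency sets.

    `old` and `new` are assumed to be flat dicts covering ALL resolved
    packages (direct and transitive), e.g. as parsed from a lockfile.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)  # None means "not present"
    return changes


# Hypothetical snapshots: bumping one direct dependency pulled in a new
# transitive one as well.
old_lock = {"ui-kit": "3.2.0", "tracker-sdk": "2.1.0"}
new_lock = {"ui-kit": "3.2.0", "tracker-sdk": "2.2.0", "charting-lib": "0.9.1"}

for name, (before, after) in changed_dependencies(old_lock, new_lock).items():
    print(f"review {name}: {before} -> {after}")
```

This only tells you what to look at; actually reading the diffs is still the expensive part, and it does nothing for scripts loaded from a third party at runtime.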
The mitigation here is substantially simpler: don’t include code loading from or sending data to 3rd parties on pages that contain sensitive business and personal information that you are obligated to protect. Especially when that’s your core business.
People would be much more understanding of this issue if it was a supply chain attack. It wasn’t: they intentionally included scripts from third parties where there shouldn’t have been any. That the scripts were extracting slightly more data than they thought… really isn’t the issue.
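One way to enforce “no third parties on sensitive pages” in the browser, rather than relying on everyone remembering, is a Content-Security-Policy header. Below is a rough Python sketch, assuming signed-in pages live under a couple of path prefixes; the prefixes and function names are hypothetical, and a real app would wire this into whatever framework it uses.

```python
# Hypothetical path prefixes for pages that show customer data.
SENSITIVE_PREFIXES = ("/account", "/admin")


def build_csp(allowed_origins=()):
    """Build a Content-Security-Policy value that limits scripts, images and
    outgoing requests to this origin plus any explicitly listed origins."""
    sources = " ".join(["'self'", *allowed_origins])
    return "; ".join([
        "default-src 'self'",
        f"script-src {sources}",
        f"connect-src {sources}",  # fetch/XHR/beacon destinations
        f"img-src {sources}",      # tracking "pixels" are image requests
    ])


def headers_for(path):
    """Return extra response headers for a request path (sketch only)."""
    if path.startswith(SENSITIVE_PREFIXES):
        # No third-party origins at all on sensitive pages.
        return {"Content-Security-Policy": build_csp()}
    return {}


print(headers_for("/admin/buckets"))
```

With a policy like this, a third-party script tag that sneaks into a signed-in page gets blocked by the browser instead of silently running.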
But why would you want to integrate your customers’ admin panel with Facebook? It compromises their privacy and your company secrets.
The only reason I can imagine is measuring conversions, but again is it worth the risks?
Well, it’s a trade-off, isn’t it? In theory, code reviewing (and self-hosting!) every dependency could provide the best security. That’s feasible if you’re comfortable with using few dependencies, but it might not always be possible.
If you’re not going to be reviewing your dependencies though, the very least you should do is to reflect on whether the dependency is managed by someone who you have reasons to believe isn’t going to do anything creepy. I would, for example, probably trust jQuery, because they don’t (AFAIK) have a history of being creepy. Do we have a reason to trust Facebook to not be creepy? Absolutely not. So maybe don’t use their tracking library.
Above all that though, host your code on your own damn servers. There’s no good reason to give a library vendor (or an attacker with access to your library vendor’s web server) the technical ability to inject arbitrary code into your app just by changing a file on their end. This should be an obvious thing just from a reliability perspective too. Thanks to Hyrum’s law, every change is a potential breaking change, so it seems ridiculous to effectively push new versions of dependencies to customers with no testing.
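For the cases where a file really does have to come from someone else’s server, Subresource Integrity is a related mitigation (it is not the same thing as self-hosting): the browser refuses to run a script whose hash doesn’t match the `integrity` attribute, so the vendor can’t swap the file out underneath you. A minimal Python sketch of computing that value follows; the helper names, URL, and file contents are placeholders.

```python
import base64
import hashlib


def sri_hash(content: bytes) -> str:
    """Return a Subresource Integrity value ("sha384-<base64 digest>")
    for the exact bytes of a script or stylesheet."""
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def script_tag(src: str, content: bytes) -> str:
    """Render a <script> tag pinned to the reviewed version of the file."""
    return (
        f'<script src="{src}" integrity="{sri_hash(content)}" '
        'crossorigin="anonymous"></script>'
    )


# Placeholder bytes standing in for the version of the library you tested.
reviewed_copy = b"console.log('hello');"
print(script_tag("https://cdn.example/lib.js", reviewed_copy))
```

Note this only works for files that are supposed to stay byte-for-byte stable. A vendor script that is meant to change on the vendor’s schedule can’t be pinned this way, which is sort of the point.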
At this point, being surprised by the Facebook SDK doing bad things is like being surprised when you’re cut off from the App Store or Google Play without redress.
It’s the nature of the beast, and I’m disappointed when the companies involved trot out the “poor us” or “we didn’t know” messaging.
This should be folded into https://lobste.rs/s/blhsea (heads up @pushcx)
Just to be clear, it sent the names of files in the payload, not their contents. That’s still a breach of trust, and potentially a very serious one for some use cases, but this headline overstates the scale of the problem.
From what I can see it’s at least names, sizes and modification(?) dates. So names understates it.
But yes, it really depends on the use case. On a backup service, I certainly think a file name mentioning some activity (health-related, for example) together with a date could be more compromising than out-of-context content (a random x-ray scan) would be.
Of course in the given example that’s probably not the biggest issue, but who knows how many products have the potential to use Backblaze accounts in various contexts. The seriousness of tying file names important enough to be backed up professionally to Facebook profiles seems hard to overstate.
Just a quick update: they’ve looked into and verified the issue and have pushed out a fix. They will continue to investigate and will provide updates as they have them.
Can you stop saying this and make some kind of actual commitment that you will actually protect the privacy of your users? This is a massive violation of privacy, and more needs to be done to prevent this happening again. I’m glad that all my data in B2 (several TB) is only accessed via Arq and I almost never use the web UI, but others definitely won’t be that lucky, and now Facebook knows possibly extremely sensitive data from your users. John’s going to be very upset to find out that Facebook knows he has HIV Positive Test Results - John Doe.pdf in his bucket.
For context: https://lobste.rs/s/blhsea
The long and short of it is that for ~2 months Backblaze’s usage of a Facebook “pixel” on signed-in pages resulted in Facebook getting the names and sizes of files displayed in the B2 web UI.