Threads for badroot


    What is there that’s genuinely interesting and exciting in the world of software, and what was your way of engaging your interest and excitement?

    Full disclosure: I work for a company which does secure analytics, so an improved perception of the state of the art is something that would benefit us. I’m writing this independently, and I’ll be as objective as possible.

    Based on my disclosure, you can probably tell I’m going to discuss secure analytics. I work with two different technologies to do this, Secure Multiparty Computation (MPC) and Homomorphic Encryption (HE). To avoid going too in-depth for this post, I’ll summarize the critical point: both technologies allow you to perform computations on and between encrypted data.
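    To make "computations between encrypted data" a little more concrete, here is a minimal sketch of additive secret sharing, one of the simplest building blocks used in multiparty computation. It is a toy, not a description of any production system: the modulus `PRIME` and the helper names `share`, `reconstruct`, and `secure_sum` are assumptions made up for this illustration, and the parties are simulated in a single process.

```python
import secrets

# Assumption for this sketch: all arithmetic is done modulo a fixed public prime.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split `value` into n_parties random-looking shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine shares; any strict subset of them reveals nothing about the value."""
    return sum(shares) % PRIME


def secure_sum(private_inputs):
    """Simulate n data owners computing the sum of their inputs.

    Owner i splits its input and sends share j to party j. Each party only
    ever sees one share per owner, adds them up locally, and publishes the
    partial sum. Only the total is revealed.
    """
    n = len(private_inputs)
    distributed = [share(v, n) for v in private_inputs]
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    return reconstruct(partial_sums)


if __name__ == "__main__":
    print(secure_sum([12, 30, 0]))
```

    Real protocols add a lot on top of this (multiplication, malicious-party protection, network transport), but the core idea of computing on data nobody can read individually is the same.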

    In the short term, fully encrypted analytics services let businesses keep their own data totally secure while still outsourcing computing power. Beyond that, there are strategies that allow computation across multiple data sets from different sources, so data owners can run analytics over joined data sets without ever giving other parties (collaborators or not) access to their data. It’s an exciting technology in its own right, and working on and implementing tech like this still feels rewarding, but what really drives me is the long-term impact this sort of technology can have on the world.
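    The outsourcing case can be sketched with a toy version of the Paillier cryptosystem, which is only additively homomorphic (it is not the fully homomorphic kind of scheme, and nothing here says which scheme any particular product uses). The tiny primes `P` and `Q` and the function names are assumptions chosen for readability; keys this small offer no security at all.

```python
import math
import secrets

# Toy key material (assumption: tiny primes, for illustration only).
P, Q = 1009, 1013
N = P * Q                     # public modulus
N_SQ = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private: lcm(P-1, Q-1)
MU = pow(LAMBDA, -1, N)       # private: inverse of LAMBDA mod N (Python 3.8+)


def encrypt(m):
    """Encrypt an integer 0 <= m < N under the public key (N, g = N + 1)."""
    if not 0 <= m < N:
        raise ValueError("message out of range")
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    # (N + 1)^m mod N^2 simplifies to 1 + m*N.
    return ((1 + m * N) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c):
    """Decrypt with the private values LAMBDA and MU."""
    u = pow(c, LAMBDA, N_SQ)
    return ((u - 1) // N) * MU % N


def add_encrypted(c1, c2):
    """Run by the untrusted server: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % N_SQ


if __name__ == "__main__":
    total = add_encrypted(encrypt(20), encrypt(22))
    print(decrypt(total))
```

    The server that calls `add_encrypted` never holds the private key, so it does the work without seeing either input or the result.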

    With the “new” boom in data science and AI, we’re in a tricky place where your data is the property and capital of every service you use. It makes sense to me that any service wants more data, since more data gives more meaningful insight. But because everyone wants data, it becomes valuable, and the more valuable it becomes, the harder collaborative analytics gets, usually because of the cost of data access or of security protocol certification. I’m particularly interested in medical information, as a few members of my family suffer from diseases like Type 1 Diabetes. Studies of such diseases seem to need a large set of attributes per individual (collaboration between different kinds of service providers) and large numbers of individuals (collaboration between data owners who provide similar services) to produce meaningful insight.

    If it’s possible to allow data providers to collaborate in a simpler way while still guaranteeing total security of any individual’s personally identifying information (PII), we can get so much more information about the problems facing the world today in almost any context. Eventually, we may even be able to have a system in which users own their own data and can provide it as they see fit, while their information remains totally secure.

    That idea is what really keeps me going, and there is a long way to go until we get there. Secure analytics techniques like MPC and HE are still moving toward more universal practicality, and they don’t solve all of the issues in front of us on their own. We still might need things like Private Set Intersection (PSI) and Private Information Retrieval (PIR) to perform more anonymous filtering, Differential Privacy to really guarantee anonymity for participant information, and much more work on Adversarial ML in the encrypted space, where we can’t extrapolate information from our data as readily. Some of this is research work, some of it is implementation, and most of it will require multiple rounds of research and implementation until we find a good solution.
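    As one example of the pieces listed above, here is a rough sketch of the Laplace mechanism, the textbook way to make a counting query differentially private. The function names, the `epsilon` parameter name, and the sample records are assumptions for illustration; a real deployment would also need privacy-budget accounting and a noise sampler hardened against floating-point attacks.

```python
import random


def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=random):
    """Return a noisy count of records matching `predicate`.

    A count changes by at most 1 when one person is added or removed
    (sensitivity 1), so Laplace noise with scale 1/epsilon is enough for
    epsilon-differential privacy on this single query.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical data: ages of study participants.
    ages = [34, 51, 29, 62, 47, 58, 41]
    print(private_count(ages, lambda age: age >= 50, epsilon=0.5))
```

    Smaller `epsilon` means more noise and stronger privacy, which is exactly the trade-off that makes combining this with MPC or HE a research question rather than a drop-in step.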

    On the larger software side of this, the community, especially the Homomorphic Encryption community, has been working on sets of standards to close the gap between research implementations and production code, using better software engineering ideas to motivate the next generation of designs. It’s difficult to convey the scale of this, and how interesting it is, without going into lower-level detail about why we (might, and probably do) need new definitions of things like operators and operands in HE contexts, so again, for the sake of brevity, I’ll leave it here.