As data from far-reaching sources is collected, aggregated, and re-packaged to enable new and smarter applications, confidentiality and data security are at greater risk than ever before. Some of the most surprising and invasive threats to materialize in recent years are brought about by so-called inference attacks: successful attempts to learn sensitive information by leveraging public data such as social network updates, published research articles, and web APIs.
In this talk, I will focus on two of my research efforts to better understand and defend against these attacks. First, I will discuss work that examines the privacy risks that arise when machine learning models are used in a popular medical application, and illustrate the consequences of applying differential privacy as a defense. This work uncovers a new type of inference attack on machine learning models, and shows via an in-depth case study how to understand privacy “in situ” by balancing the attacker’s chance of success against the likelihood of harmful medical outcomes. The second part of the talk will detail work that helps developers correctly write privacy-aware applications using verification tools. I will illustrate how a wide range of confidentiality guarantees can be framed in terms of a new logical primitive called Satisfiability Modulo Counting, and describe a tool that I have developed around this primitive that automatically finds privacy bugs in software (or proves that the software is free of them). Through a better understanding of how proposed defenses impact real applications, and by providing tools that help developers implement the correct defense for their task, we can begin to proactively identify potential threats to privacy and take steps to ensure that they will not surface in practice.