Free for All! Assessing User Data Exposure to Advertising Libraries on Android


In this work, we systematically explore the potential reach of advertising libraries through these channels. We design a framework called Pluto that can be leveraged to analyze an app and discover whether it exposes targeted user data—such as contact information, interests, demographics, medical conditions and so on—-to an opportunistic ad library. We present a prototype implementation of Pluto, that embodies novel strategies for using natural language processing to illustrate what targeted data can potentially be learned from an ad network using files and user inputs. Pluto also leverages machine learning and data mining models to reveal what advertising networks can learn from the list of installed apps. We validate Pluto with a collection of apps for which we have determined ground truth about targeted data they may reveal, together with a data set derived from a survey we conducted that gives ground truth for targeted data and corresponding lists of installed apps for about 300 users. We use these to show that Pluto, and hence also opportunistic ad networks, can achieve 75% recall and 80% precision for selected targeted data coming from app files and inputs, and even better results for certain targeted data based on the list of installed apps. Pluto is the first tool that estimates the risk associated with integrating advertising in apps based on the four available channels and arbitrary sets of targeted data.

In the Annual Network and Distributed System Security Symposium.