Master of Science
Cross-device tracking has drawn growing attention from both commercial companies and the general public because of its privacy implications and its applications in user profiling, personalized services, and user authentication. One widely used type of cross-device tracking leverages the browsing histories of user devices, characterized, for example, by the list of IP addresses used by each device and the domains visited by each device. State-of-the-art browsing-history-based methods compute a similarity score for a device pair using only the IPs used by both devices and the domains visited by both devices, and then apply supervised machine learning. These methods cannot capture latent correlations among IPs/domains, and they require a large number of labeled device pairs, which are time-consuming and costly to obtain.
This work proposes GraphTrack, an unsupervised graph-based cross-device tracking framework that tracks users across different devices by correlating the browsing histories on these devices. Specifically, the complex interplay among IPs, domains, and devices is modeled as graphs to capture the latent correlations between IPs/domains. Moreover, random walk with restart is adapted to compute similarity scores between devices based on the graphs. GraphTrack leverages the similarity scores to perform unsupervised cross-device tracking and can be extended to incorporate manual labels if available. GraphTrack is evaluated on a real-world dataset. The results show that GraphTrack substantially outperforms the state-of-the-art method, e.g., by 13% in accuracy.
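As an illustration of the general idea only, the following minimal sketch runs a plain random walk with restart on a toy device-IP/domain graph. It is not the adapted variant used by GraphTrack: the single unweighted graph, the restart probability of 0.15, the fixed iteration count, and all device, IP, and domain names are assumptions made for this example. The sketch also assumes every device has at least one IP or domain in its history.

```python
from collections import defaultdict


def build_graph(histories):
    """Build an undirected graph linking each device to the IPs/domains it used.

    `histories` maps a device name to a set of items; item names are assumed
    to be distinct from device names (hypothetical "ip:"/"dom:" prefixes).
    """
    graph = defaultdict(set)
    for device, items in histories.items():
        for item in items:
            graph[device].add(item)
            graph[item].add(device)
    return graph


def random_walk_with_restart(graph, start, restart=0.15, iters=100):
    """Approximate stationary visiting probabilities of a walk restarting at `start`."""
    scores = {node: 0.0 for node in graph}
    scores[start] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in graph}
        for node, mass in scores.items():
            # Spread the non-restarting mass evenly over the node's neighbors.
            share = (1.0 - restart) * mass / len(graph[node])
            for neighbor in graph[node]:
                nxt[neighbor] += share
        # The restarting mass always returns to the start device.
        nxt[start] += restart
        scores = nxt
    return scores


# Toy browsing histories (all values are made up for illustration).
histories = {
    "d1": {"ip:1.1.1.1", "dom:news.example"},
    "d2": {"ip:1.1.1.1", "dom:mail.example"},
    "d3": {"ip:2.2.2.2", "dom:mail.example"},
    "d4": {"ip:9.9.9.9", "dom:other.example"},
}

graph = build_graph(histories)
scores = random_walk_with_restart(graph, "d1")

# Rank the other devices by their similarity to d1.
ranking = sorted(
    (d for d in histories if d != "d1"), key=lambda d: scores[d], reverse=True
)
print(ranking)
```

In this toy graph, d1 and d3 share no IP or domain, so a score based only on common items would be zero for that pair. The walk still assigns d3 a positive score because the two devices are connected through d2, which is the kind of latent correlation that a graph-based score can capture.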
Zhou, Tianchen, "GraphTrack: An unsupervised graph-based cross-device tracking framework" (2018). Graduate Theses and Dissertations. 17377.