Degree Type
Thesis
Date of Award
2013
Degree Name
Master of Science
Department
Electrical and Computer Engineering
First Advisor
Srikanta Tirthapura
Abstract
We consider continuous maintenance of a random sample of distinct elements from a massive data stream, whose input elements are observed at multiple distributed sites that communicate via a central coordinator. At any point, when a query is received at the coordinator, it responds with a random sample from the set of all distinct elements observed at the different sites so far. We present the first algorithms for distinct random sampling on distributed streams. We also present a lower bound on the expected number of messages that must be transmitted by any distributed algorithm, showing that our algorithm is message optimal to within a factor of four. We present extensions to sliding windows, and detailed experimental results showing the performance of our algorithm on real-world data sets.
DOI
https://doi.org/10.31274/etd-180810-1184
Copyright Owner
Yung-Yu Chung
Copyright Date
2013
Language
en
File Format
application/pdf
File Size
41 pages
Recommended Citation
Chung, Yung-Yu, "Distinct random sampling from a distributed stream" (2013). Graduate Theses and Dissertations. 13863.
https://lib.dr.iastate.edu/etd/13863