Campus Units
Electrical and Computer Engineering, Computer Science
Document Type
Article
Publication Version
Accepted Manuscript
Publication Date
11-2014
Journal or Book Title
Journal of Parallel and Distributed Computing
Volume
74
Issue
11
First Page
3115
Last Page
3127
DOI
10.1016/j.jpdc.2014.07.008
Abstract
A persistent item in a stream is one that occurs regularly in the stream without necessarily contributing significantly to the volume of the stream. Persistent items are often associated with anomalies in network streams, such as botnet traffic and click fraud. While it is important to track persistent items in an online manner, it is challenging to zero-in on such items in a massive distributed stream. We present the first communication-efficient distributed algorithms for tracking persistent items in a data stream whose elements are partitioned across many different sites. We consider both infinite window and sliding window settings, and present algorithms that can track persistent items approximately with a probabilistic guarantee on the approximation error. Our algorithms have a provably low communication cost, and a low rate of false positives and false negatives, with a high probability. We present detailed results from an experimental evaluation that show the communication cost is small, and that the false positive and false negative rates are typically much lower than theoretical guarantees.
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
Copyright Owner
Elsevier Inc.
Copyright Date
2014
Language
en
File Format
application/pdf
Recommended Citation
Singh, Sneha Aman and Tirthapura, Srikanta, "Incremental Maintenance of Maximal Cliques in a Dynamic Graph" (2014). Electrical and Computer Engineering Publications. 169.
https://lib.dr.iastate.edu/ece_pubs/169
Comments
This is a manuscript of an article published as Singh, Sneha Aman, and Srikanta Tirthapura. "Monitoring persistent items in the union of distributed streams." Journal of Parallel and Distributed Computing 74, no. 11 (2014): 3115-3127. DOI: 10.1016/j.jpdc.2014.07.008. Posted with permission.