Campus Units
Electrical and Computer Engineering, Computer Science
Document Type
Article
Publication Version
Submitted Manuscript
Publication Date
6-10-2019
Journal or Book Title
arXiv
Abstract
This paper investigates parallel random sampling from a potentially unending data stream whose elements are revealed in a series of element sequences (minibatches). While sampling from a stream has been studied extensively in the sequential setting, little has been explored in the parallel context; prior parallel random-sampling algorithms focus on the static batch model. We present parallel algorithms for minibatch-stream sampling in two settings: (1) sliding window, which draws samples from a prespecified number of the most recently observed elements, and (2) infinite window, which draws samples from all the elements received. Our algorithms are computationally and memory efficient: their work matches that of the fastest sequential counterpart, their parallel depth is small (polylogarithmic), and their memory usage matches the best known.
Copyright Owner
The Authors
Copyright Date
2019
Language
en
File Format
application/pdf
Recommended Citation
Tangwongsan, Kanat and Tirthapura, Srikanta, "Parallel Streaming Random Sampling" (2019). Electrical and Computer Engineering Publications. 220.
https://lib.dr.iastate.edu/ece_pubs/220
Comments
This is the pre-print of the article: Tangwongsan, Kanat, and Srikanta Tirthapura. "Parallel Streaming Random Sampling." arXiv preprint arXiv:1906.04120 (2019). Posted with permission.