Campus Units

Electrical and Computer Engineering

Document Type

Article

Publication Version

Accepted Manuscript

Publication Date

2015

Journal or Book Title

Algorithmica

Volume

73

Issue

2

First Page

235

Last Page

260

DOI

10.1007/s00453-014-9917-1

Abstract

On a stream S" role="presentation" style="box-sizing: border-box; display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; position: relative;">SS of two dimensional data items (x,y)" role="presentation" style="box-sizing: border-box; display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; position: relative;">(x,y)(x,y) where x" role="presentation" style="box-sizing: border-box; display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; position: relative;">xx is an item identifier and y" role="presentation" style="box-sizing: border-box; display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; position: relative;">yy is a numerical attribute, a correlated aggregate query C(σ,AGG,S)" role="presentation" style="box-sizing: border-box; display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; position: relative;">C(σ,AGG,S)C(σ,AGG,S) asks to first apply a selection predicate σ" role="presentation" style="box-sizing: border-box; display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; position: relative;">σσ along the y" role="presentation" style="box-sizing: border-box; display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; position: relative;">yy dimension, followed by an aggregation AGG" role="presentation" style="box-sizing: border-box; display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; position: relative;">AGGAGG along the x" role="presentation" style="box-sizing: border-box; display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; position: relative;">xx dimension. For selection predicates of the form (y(y(y>c)(y>c), where parameter c" role="presentation" style="box-sizing: border-box; display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; position: relative;">cc is provided at query time, we present new streaming algorithms and lower bounds for estimating correlated aggregates. Our main result is a general method that reduces the estimation of a correlated aggregate AGG" role="presentation" style="box-sizing: border-box; display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; position: relative;">AGGAGG to the streaming computation of AGG" role="presentation" style="box-sizing: border-box; display: inline-table; line-height: normal; letter-spacing: normal; word-spacing: normal; word-wrap: normal; white-space: nowrap; float: none; direction: ltr; max-width: none; max-height: none; min-width: 0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px; position: relative;">AGGAGG over an entire stream, for an aggregate that satisfies certain conditions. This results in the first sublinear space algorithms for the correlated estimation of a large family of statistics, including frequency moments. Our experimental validation shows that the memory requirements of these algorithms are significantly smaller than existing linear storage solutions, and that these achieve a fast per-record processing time. We also study the setting when items have weights. In the case when weights can be negative, we give a strong space lower bound which holds even if the algorithm is allowed up to a logarithmic number of passes over the data. We complement this with a small space algorithm which uses a logarithmic number of passes.

Comments

The final publication is available at Springer via https://doi.org/10.1007/s00453-014-9917-1. Tirthapura, Srikanta, and David P. Woodruff. "A general method for estimating correlated aggregates over a data stream." Algorithmica 73, no. 2 (2015): 235-260. Posted with permission.

Copyright Owner

Springer Science+Business Media New York

Language

en

File Format

application/pdf

Published Version

Share

COinS