Practical aspects of `0-sampling algorithms

Juan P. A. Lopes; Fabiano S. Oliveira; Valmir C. Barbosa

doi:10.5753/etc.2018.3145

Juan P. A. Lopes
Fabiano S. Oliveira
Valmir C. Barbosa

DOI: https://doi.org/10.5753/etc.2018.3145

Resumo

The `0-sampling problem plays an important role in streaming graph algorithms. In this paper, we revisit a near-optimal `0-sampling algorithm, proposing a variant that allows proving a tighter upper bound for the probability of failure. We compare experimental results of both variants, providing empirical evidence of their applicability in real-case scenarios. The `0-sampling problem consists in sampling a nonzero coordinate from a dynamic vector a = (a1, . . . , an) with uniform probability. This vector is defined in a turnstile model, which consists of a stream of updates S = hs1, s2, . . . , sti on a (initially 0), where si = (ui, i) 2 { 1, . . . , n} ⇥ R for all 1  i  t, meaning an increment of i units to aui . It is desirable that such sample be produced in a single pass through the stream with sublinear space complexity. The challenge arises from the fact that, since i can be negative and hence some updates in the stream may cancel others, directly sampling the stream may lead to incorrect results.