Abstract
This entry formalizes a randomized cardinality estimation data structure with asymptotically optimal space usage. It is inspired by the streaming algorithm presented by Błasiok in 2018. His work closed the gap between the best-known lower bound and upper bound after a long line of research started by Flajolet and Martin in 1984 and was to first to apply expander graphs (in addition to hash families) to the problem. The formalized algorithm has two improvements compared to the algorithm by Błasiok. It supports operation in parallel mode, and it relies on a simpler pseudo-random construction avoiding the use of code based extractors.
License
Topics
- Computer science/Algorithms/Distributed
- Computer science/Algorithms/Approximation
- Computer science/Algorithms/Randomized
Related publications
- Karayel, E. (2023). An Embarrassingly Parallel Optimal-Space Cardinality Estimation Algorithm. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPICS.APPROX/RANDOM.2023.35
- Karayel, E. (2023). An embarrassingly parallel optimal-space cardinality estimation algorithm (Version 1). arXiv. https://doi.org/10.48550/ARXIV.2307.00985
Session Distributed_Distinct_Elements
- Distributed_Distinct_Elements_Preliminary
- Distributed_Distinct_Elements_Balls_and_Bins
- Distributed_Distinct_Elements_Tail_Bounds
- Distributed_Distinct_Elements_Inner_Algorithm
- Distributed_Distinct_Elements_Accuracy_Without_Cutoff
- Distributed_Distinct_Elements_Cutoff_Level
- Distributed_Distinct_Elements_Accuracy
- Distributed_Distinct_Elements_Outer_Algorithm