Engineering Blog

Data, data quality

Counting Towards Infinity: Next Generation Data Warehousing (Part I)

In this multipart series, "Counting Towards Infinity", we'll explore various approximation-based sketches for handling the problems we encounter in our most expensive queries. Let's start with the technical details of our first sketch: HyperLogLog, a linear-time, constant-space algorithm for estimating multiset cardinality. Turn uses it to efficiently estimate the number of unique records that satisfy a given input query.
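
To make the idea concrete before diving into the details, here is a minimal Python sketch of a HyperLogLog counter. It is an illustration only, not Turn's production implementation: the class name, the choice of SHA-1 as the hash function, and the default of 2^12 registers are all assumptions made for this example. Each item is hashed once (linear time overall), and the only state kept is a fixed-size array of small integers (constant space).

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog counter (hypothetical, not Turn's code)."""

    def __init__(self, p=12):
        # p index bits give m = 2**p registers; p >= 7 so the alpha
        # approximation below (valid for m >= 128) applies.
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Assumption: take the first 64 bits of SHA-1 as the hash value.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                  # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        # Harmonic mean of 2**register across all registers, scaled by alpha.
        z = sum(2.0 ** -r for r in self.registers)
        raw = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog()
for i in range(100_000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")  # duplicates do not change the estimate
print(round(hll.estimate()))
```

With 4,096 registers the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%, regardless of how many records are counted.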
