google / zetasketch
A collection of libraries for single-pass, distributed, sublinear-space approximate aggregation and sketching algorithms. Currently: HyperLogLog++; more to come.
☆152 Updated 2 years ago
Related projects
Alternatives and complementary repositories for zetasketch
- Apache DataSketches ☆87 Updated last year
- GCS support for avro-tools, parquet-tools and protobuf ☆73 Updated this week
- DDSketch: A Fast and Fully-Mergeable Quantile Sketch with Relative-Error Guarantees. ☆113 Updated 4 months ago
- DBeam exports SQL tables into Avro files using JDBC and Apache Beam ☆191 Updated this week
- Idempotent query executor ☆50 Updated 9 months ago
- Website for DataSketches. ☆95 Updated this week
- HyperLogLog (original and HyperLogLog++) algorithm implementation in Java. ☆81 Updated 3 years ago
- Cache File System optimized for columnar formats and object stores ☆183 Updated 2 years ago
- Collection of utilities to allow writing Java code that operates across a wide range of Avro versions. ☆76 Updated 2 months ago
- A framework for rapid reporting API development, with out-of-the-box support for high-cardinality dimension lookups with Druid. ☆129 Updated 4 months ago
- High performance native memory access for Java. ☆122 Updated this week
- Sketch adaptors for Hive. ☆48 Updated last month
- Harry for Apache Cassandra® ☆54 Updated 2 months ago
- This repository provides Scotty, a framework for efficient window aggregations for out-of-order Stream Processing. ☆75 Updated last year
- TLA+ specification of the Kafka replication protocol ☆86 Updated 4 years ago
- Spark SQL index for Parquet tables ☆132 Updated 3 years ago
- Albis: High-Performance File Format for Big Data Systems ☆21 Updated 6 years ago
- Union, intersection, and set cardinality in loglog space ☆54 Updated last year
- Enabling queries on compressed data. ☆278 Updated 10 months ago
- A Scalable Concurrent Key-Value Map for Big Data Analytics ☆268 Updated 9 months ago
- ☆104 Updated last year
- ☆64 Updated last week
- Scala Aggregators used for ML Model metrics monitoring ☆91 Updated last year
- Self-regulation and auto-tuning for distributed systems ☆64 Updated last year
- Hadoop output committers for S3 ☆108 Updated 4 years ago
- ☆36 Updated last year
- Immutable key/value store with efficient space utilization and fast reads. They are ideal for the use-case of tables built by batch proce… ☆96 Updated last year
- Probabilistic data structures for Guava. ☆54 Updated 4 years ago
- A lightweight workflow definition library ☆154 Updated 2 years ago
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating) ☆62 Updated 6 months ago