tdunning / t-digest
A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means
☆1,974 Updated 9 months ago
Related projects:
- Stream summarizer and cardinality estimator. ☆2,260 Updated 4 years ago
- A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Apache Pinot, Tablesaw, and many others ☆3,497 Updated last week
- In-memory dimensional time series database. ☆3,431 Updated 2 weeks ago
- A High Dynamic Range (HDR) Histogram ☆2,163 Updated 2 months ago
- A crazy fast analytical database, built on bitmaps. Perfect for ML applications. Learn more at: http://docs.featurebase.com/. Start a Doc… ☆2,530 Updated 6 months ago
- A framework for distributed systems verification, with fault injection ☆6,769 Updated this week
- Distributed Prometheus time series database ☆1,428 Updated this week
- t-Digest data structure in Python. Useful for percentiles and quantiles, including distributed environments like PySpark ☆381 Updated last year
- Distributed storage for sequential data ☆1,901 Updated 2 years ago
- Apache Parquet Format ☆1,744 Updated 3 weeks ago
- Beringei is a high performance, in-memory storage engine for time series data. ☆3,172 Updated 6 years ago
- Secor is a service implementing Kafka log persistence ☆1,844 Updated last month
- Volt Active Data ☆2,120 Updated 5 months ago
- A software library of stochastic streaming algorithms, a.k.a. sketches. ☆888 Updated this week
- Apache Parquet Java ☆2,558 Updated 3 weeks ago
- Apache Drill is a distributed MPP query layer for self describing data ☆1,928 Updated 3 weeks ago
- Fast scalable time series database ☆1,735 Updated 4 months ago
- Mirror of Apache Samza ☆811 Updated 3 weeks ago
- What are the differences between the transaction isolation levels in databases? This is a suite of test cases which differentiate isolati… ☆2,424 Updated 2 months ago
- WiredTiger's source tree ☆2,190 Updated this week
- Berkeley Tree Database (BTrDB) server ☆906 Updated 3 years ago
- Apache Avro is a data serialization system. ☆2,891 Updated this week
- MacroBase: A Search Engine for Fast Data ☆660 Updated last year
- A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, orga… ☆2,214 Updated last week
- High-performance time-series aggregation for PostgreSQL ☆2,633 Updated 2 years ago
- Parsing and analysis of Vertica, Hive, and Presto SQL. ☆1,073 Updated 2 years ago
- FlameScope is a visualization tool for exploring different time ranges as Flame Graphs. ☆3,002 Updated 11 months ago
- A Java agent to generate method mappings to use with the Linux `perf` tool ☆1,645 Updated 4 years ago
- HyperLogLog with lots of sugar (Sparse, LogLog-Beta bias correction and TailCut space reduction) brought to you by Axiom ☆937 Updated 3 weeks ago
- Time-series database ☆837 Updated 2 years ago