umbrant / QuantileEstimation
Streaming estimation of percentiles, especially high percentiles.
☆63 Updated 12 years ago
Related projects
Alternatives and complementary repositories for QuantileEstimation
- A collection of algorithms for mining data streams ☆201 Updated 11 months ago
- Lossy Counting and Sticky Sampling implementation for efficient frequency counts on data streams. ☆62 Updated 8 years ago
- Bloofi: A java implementation of multidimensional Bloom filters ☆78 Updated 8 years ago
- Enabling queries on compressed data. ☆278 Updated 11 months ago
- Persistent Adaptive Radix Trees in Java ☆79 Updated 4 years ago
- ☆92 Updated 9 years ago
- Myria is a scalable Analytics-as-a-Service platform based on relational algebra. ☆113 Updated 3 years ago
- Low latency, strong consistency, fault tolerant distributed key value store. Colocate data and compute to achieve best performance cloud … ☆113 Updated 9 years ago
- Simulating the performance of various streaming algorithms. #experimentalmathematics ☆59 Updated 6 years ago
- Bitmap compression using the CONCISE algorithm ☆43 Updated 7 years ago
- Probabilistic data structures for Guava. ☆54 Updated 4 years ago
- Yggdrasil: Faster Decision Trees Using Column Partitioning in Spark ☆31 Updated 6 years ago
- Reduce your data. A unix filter for algebird-powered aggregation. ☆138 Updated 7 years ago
- Secondary index on HBase ☆18 Updated 9 years ago
- Hadoop Profiler, or hprofiler, is a tool which is able to analyze on- and off-CPU workloads on distributed computing environments. ☆24 Updated 8 years ago
- Schema and type system for creating sortable byte[] ☆48 Updated 11 years ago
- Cantor provides utilities for estimating the cardinality of large sets. ☆83 Updated 2 years ago
- Notes from VLDB conference ☆30 Updated 9 years ago
- Probabilistic data structures server. The data model is key-value, where values are: Bloomfilters, LinearCounters, HyperLogLogs, CountMin… ☆24 Updated 8 years ago
- ☆110 Updated 7 years ago
- Big Data Made Easy ☆184 Updated 6 years ago
- Hadoop mapreduce job to bulk load data into Cassandra ☆75 Updated 2 years ago
- SAMOA (Scalable Advanced Massive Online Analysis) is an open-source platform for mining big data streams. ☆427 Updated 8 years ago
- Distributed Matrix Library ☆70 Updated 7 years ago
- A prototype of Hive UDFs/UDTFs that execute nested SQL queries within rows. ☆54 Updated 9 years ago
- Scala stuff ☆18 Updated 5 years ago
- Very Fast Machine Learning Toolkit ☆27 Updated 11 years ago
- A Real-Time Analytical Processing (RTAP) example using Spark/Shark ☆51 Updated 10 years ago