gakhov / pdsa
Probabilistic Data Structures and Algorithms in Python
☆121 · Updated 4 years ago
Related projects
Alternatives and complementary repositories for pdsa
- Core C++ Sketch Library ☆225 · Updated 3 weeks ago
- Python bindings for xorfilter (faster and smaller than bloom and cuckoo filters) ☆111 · Updated last month
- Probabilistic data structures in python http://pyprobables.readthedocs.io/en/latest/index.html ☆112 · Updated this week
- Keyvi - the key value index. It is an in-memory FST-based data structure highly optimized for size and lookup performance. ☆239 · Updated 2 weeks ago
- A General-Purpose Counting Filter: Counting Quotient Filter ☆126 · Updated last year
- Readings in Stream Processing ☆119 · Updated this week
- Python library for efficiently handling sorted integer sets. ☆205 · Updated 2 months ago
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame. ☆298 · Updated 5 months ago
- MonetDBLite as a Python Package ☆32 · Updated 2 years ago
- Sketching linear classifiers over data streams with the Weight-Median Sketch (SIGMOD 2018). ☆38 · Updated 6 years ago
- Python bindings to Succinct Data Structure Library 2.0 ☆30 · Updated 5 years ago
- Website for DataSketches. ☆95 · Updated this week
- Reference implementations of sliding window aggregation algorithms ☆43 · Updated last year
- 🐍 Python library implementing sorted containers with state-of-the-art query performance and compressed memory usage ☆207 · Updated 7 months ago
- Distribution transparent Machine Learning experiments on Apache Spark ☆90 · Updated 9 months ago
- PostgreSQL extension providing approximate algorithms based on apache/datasketches-cpp ☆85 · Updated 8 months ago
- A Python-to-SQL transpiler as replacement for Python Pandas ☆47 · Updated last year
- Paper Summaries ☆55 · Updated 4 years ago
- Fast HyperLogLog for Python. ☆100 · Updated 2 months ago
- Point-in-Time optimizations for Apache Spark ☆29 · Updated 10 months ago
- Paper about the estimation of cardinalities from HyperLogLog sketches ☆61 · Updated 3 years ago
- Juho Hirvonen and Jukka Suomela: Distributed Algorithms 2020 ☆68 · Updated this week
- An HPC Interface for data analysis platforms ☆22 · Updated 4 years ago
- A composable framework for fast and scalable data analytics ☆57 · Updated last year
- A work-in-progress book on Dask ☆12 · Updated last year
- Python implementations of the distributed quantile sketch algorithm DDSketch ☆85 · Updated 2 months ago
- Your worst case is our best case. ☆139 · Updated 7 years ago
- A Scalable Auto-ML System ☆51 · Updated last year
- Multi-core Window-Based Stream Processing Engine ☆70 · Updated 3 years ago