gakhov / pdsa
Probabilistic Data Structures and Algorithms in Python
☆129 · Updated 5 years ago
Alternatives and similar repositories for pdsa
Users who are interested in pdsa are comparing it to the libraries listed below:
- Python bindings to Succinct Data Structure Library 2.0 ☆32 · Updated 6 years ago
- Core C++ Sketch Library ☆234 · Updated last month
- Python bindings for xorfilter (faster and smaller than bloom and cuckoo filters) ☆116 · Updated last month
- Flow with FlorDB 🌻 ☆154 · Updated last month
- Lambda Learner is a library for iterative incremental training of a class of supervised machine learning models. ☆42 · Updated 2 years ago
- Parameterless and Universal FInding of Nearest Neighbors ☆60 · Updated 4 months ago
- Sketching linear classifiers over data streams with the Weight-Median Sketch (SIGMOD 2018). ☆39 · Updated 6 years ago
- Distribution transparent Machine Learning experiments on Apache Spark ☆91 · Updated last year
- Python implementations of the distributed quantile sketch algorithm DDSketch ☆87 · Updated 2 months ago
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ☆29 · Updated 7 months ago
- A Scalable Auto-ML System ☆53 · Updated 2 years ago
- Website for DataSketches. ☆102 · Updated last month
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame. ☆301 · Updated last year
- Python bindings for the fast integer compression library FastPFor. ☆60 · Updated last year
- The bare necessities of Pandas on the Weld runtime ☆14 · Updated 2 years ago
- ☆36 · Updated last year
- Moments Sketch Code ☆40 · Updated 6 years ago
- An open source ML system for the end-to-end data science lifecycle ☆37 · Updated 4 years ago
- Application of Locality Sensitive Hashing to Audio Fingerprinting ☆59 · Updated 7 years ago
- Embedded MonetDB with a Python frontend and fast Numpy/Pandas support ☆62 · Updated 9 months ago
- A database with automatic dynamic imputation of missing values. ☆11 · Updated 7 years ago
- Keyvi - the key value index. It is an in-memory FST-based data structure highly optimized for size and lookup performance. ☆247 · Updated 3 weeks ago
- Weighted MinHash implementation on CUDA (multi-gpu). ☆117 · Updated last year
- A collection of libraries for single-pass, distributed, sublinear-space approximate aggregation and sketching algorithms. Currently: Hype… ☆157 · Updated last month
- Fast HyperLogLog for Python. ☆107 · Updated 6 months ago
- Ray-based Apache Beam runner ☆42 · Updated last year
- MonetDBLite as a Python Package ☆32 · Updated 3 years ago
- Dremio Flight connector. Access Dremio using Arrow flight ☆40 · Updated 4 years ago
- 🐍 Python library implementing sorted containers with state-of-the-art query performance and compressed memory usage ☆214 · Updated last year
- Source code for the split annotations project. ☆53 · Updated 2 years ago