gakhov / pdsa
Probabilistic Data Structures and Algorithms in Python
☆130 Updated 5 years ago
Alternatives and similar repositories for pdsa
Users who are interested in pdsa are comparing it to the libraries listed below.
- Core C++ Sketch Library ☆238 Updated last week
- A polystore database from researchers of the Intel Science and Technology Center for Big Data ☆38 Updated 2 years ago
- Distribution transparent Machine Learning experiments on Apache Spark ☆91 Updated last year
- Sketching linear classifiers over data streams with the Weight-Median Sketch (SIGMOD 2018). ☆39 Updated 7 years ago
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame. ☆302 Updated last year
- Python bindings to Succinct Data Structure Library 2.0 ☆33 Updated 6 years ago
- Flow with FlorDB 🌻 ☆154 Updated 2 months ago
- PostgreSQL extension providing approximate algorithms based on apache/datasketches-cpp ☆86 Updated last month
- A Scalable Auto-ML System ☆53 Updated 2 years ago
- Lambda Learner is a library for iterative incremental training of a class of supervised machine learning models. ☆42 Updated 2 years ago
- Python bindings for xorfilter (faster and smaller than bloom and cuckoo filters) ☆117 Updated 3 months ago
- Python implementations of the distributed quantile sketch algorithm DDSketch ☆88 Updated 3 months ago
- Website for DataSketches. ☆104 Updated last month
- Embedded MonetDB with a Python frontend and fast Numpy/Pandas support ☆63 Updated 10 months ago
- an anagram ☆136 Updated 4 years ago
- Apache datasketches ☆98 Updated 2 years ago
- Ray-based Apache Beam runner ☆41 Updated last year
- Distributed SQL Engine in Python using Dask ☆407 Updated 11 months ago
- Dremio Flight connector. Access Dremio using Arrow flight ☆40 Updated 4 years ago
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ☆29 Updated 8 months ago
- A database with automatic dynamic imputation of missing values. ☆11 Updated 7 years ago
- In-Memory Analytics with Apache Arrow, published by Packt ☆103 Updated last year
- Apache datasketches ☆34 Updated 2 weeks ago
- A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to … ☆235 Updated 3 weeks ago
- ☆36 Updated 2 years ago
- ☆79 Updated 2 years ago
- Keyvi - the key value index. It is an in-memory FST-based data structure highly optimized for size and lookup performance. ☆250 Updated 2 weeks ago
- A platform for online learning that curtails data latency and saves you cost. ☆47 Updated 3 years ago
- The stupidest database of all time. ☆55 Updated last week
- Spark Shuffle Optimization with RDMA+AEP ☆30 Updated 2 years ago