gakhov / pdsa
Probabilistic Data Structures and Algorithms in Python
☆130 · Updated 5 years ago
Alternatives and similar repositories for pdsa
Users who are interested in pdsa are comparing it to the libraries listed below
- Core C++ Sketch Library ☆239 · Updated last month
- Python implementations of the distributed quantile sketch algorithm DDSketch ☆88 · Updated 4 months ago
- Distribution transparent Machine Learning experiments on Apache Spark ☆91 · Updated last year
- Flow with FlorDB 🌻 ☆154 · Updated 2 weeks ago
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame. ☆302 · Updated last year
- ☆36 · Updated 2 years ago
- Python bindings for xorfilter (faster and smaller than bloom and cuckoo filters) ☆117 · Updated 2 weeks ago
- An implementation of the Random Cut Forest data structure for sketching streaming data, with support for anomaly detection, density estim… ☆228 · Updated this week
- Lambda Learner is a library for iterative incremental training of a class of supervised machine learning models. ☆42 · Updated 2 years ago
- Juho Hirvonen and Jukka Suomela: Distributed Algorithms 2020 ☆73 · Updated last week
- Website for DataSketches. ☆104 · Updated last month
- Python bindings to Succinct Data Structure Library 2.0 ☆33 · Updated 6 years ago
- A Scalable Auto-ML System ☆53 · Updated 2 years ago
- Apache datasketches ☆34 · Updated last month
- Dremio Flight connector. Access Dremio using Arrow flight ☆40 · Updated 4 years ago
- In-Memory Analytics with Apache Arrow, published by Packt ☆104 · Updated last week
- Ray-based Apache Beam runner ☆41 · Updated 2 years ago
- Embedded MonetDB with a Python frontend and fast Numpy/Pandas support ☆63 · Updated 11 months ago
- Keyvi - the key value index. It is an in-memory FST-based data structure highly optimized for size and lookup performance. ☆252 · Updated last week
- Probabilistic data structures in python http://pyprobables.readthedocs.io/en/latest/index.html ☆121 · Updated this week
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ☆29 · Updated 9 months ago
- A collection of libraries for single-pass, distributed, sublinear-space approximate aggregation and sketching algorithms. Currently: Hype… ☆159 · Updated 3 months ago
- Friendly ML feature store ☆45 · Updated 3 years ago
- Search for similar short strings ☆53 · Updated 4 years ago
- Avro2TF is designed to fill the gap of making users' training data ready to be consumed by deep learning training frameworks. ☆128 · Updated 5 years ago
- an anagram ☆136 · Updated 4 years ago
- A Python-to-SQL transpiler as replacement for Python Pandas ☆48 · Updated 2 years ago
- Interactive-Speed Analytics: 200x Faster, 200x Fewer Cluster Resources, Approximate Query Processing ☆250 · Updated 4 years ago
- Willump Is a Low-Latency Useful Machine learning Platform. ☆44 · Updated 2 years ago
- Serverless Scientific Computing ☆87 · Updated 6 years ago