gakhov / pdsa
Probabilistic Data Structures and Algorithms in Python
☆129 Updated 5 years ago
Alternatives and similar repositories for pdsa
Users who are interested in pdsa are comparing it to the libraries listed below.
- Python bindings for xorfilter (faster and smaller than bloom and cuckoo filters) ☆116 Updated last week
- Fast HyperLogLog for Python. ☆106 Updated 4 months ago
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame. ☆301 Updated 11 months ago
- Core C++ Sketch Library ☆233 Updated last week
- Probabilistic data structures in python http://pyprobables.readthedocs.io/en/latest/index.html ☆117 Updated last week
- Lambda Learner is a library for iterative incremental training of a class of supervised machine learning models. ☆42 Updated 2 years ago
- Multi-core Window-Based Stream Processing Engine ☆71 Updated 3 years ago
- Keyvi - the key value index. It is an in-memory FST-based data structure highly optimized for size and lookup performance. ☆245 Updated this week
- Source code for the split annotations project. ☆53 Updated 2 years ago
- Python bindings to Succinct Data Structure Library 2.0 ☆31 Updated 6 years ago
- A polystore database from researchers of the Intel Science and Technology Center for Big Data ☆38 Updated 2 years ago
- Embedded MonetDB with a Python frontend and fast Numpy/Pandas support ☆62 Updated 8 months ago
- Python implementations of the distributed quantile sketch algorithm DDSketch ☆87 Updated last month
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ☆29 Updated 5 months ago
- Code repo for "An Empirical Evaluation of Columnar Storage Formats" VLDB Vol 17 ☆56 Updated last year
- Flow with FlorDB 🌻 ☆155 Updated last week
- Materials for Apache Arrow workshop at VLDB 2019 ☆42 Updated 4 years ago
- The Internals of PySpark ☆26 Updated 5 months ago
- MonetDBLite as a Python Package ☆32 Updated 3 years ago
- Ibis Substrait Compiler ☆102 Updated last week
- Moments Sketch Code ☆40 Updated 6 years ago
- A Python-to-SQL transpiler as replacement for Python Pandas ☆48 Updated 2 years ago
- Enabling queries on compressed data. ☆279 Updated last year
- Ray-based Apache Beam runner ☆42 Updated last year
- Apache datasketches ☆95 Updated 2 years ago
- Cuckoo Index: A Lightweight Secondary Index Structure ☆129 Updated 3 years ago
- Parameterless and Universal FInding of Nearest Neighbors ☆60 Updated 2 months ago
- Unified Distributed Execution ☆53 Updated 7 months ago
- A General-Purpose Counting Filter: Counting Quotient Filter ☆127 Updated last year
- A collection of libraries for single-pass, distributed, sublinear-space approximate aggregation and sketching algorithms. Currently: Hype… ☆156 Updated 2 weeks ago