suhailrehman / fuzzydata
Fuzzy Data Benchmark
☆17 · Updated last year
Alternatives and similar repositories for fuzzydata
Users who are interested in fuzzydata are comparing it to the libraries listed below.
- Ibis Substrait Compiler ☆103 · Updated last week
- A Python-to-SQL transpiler as replacement for Python Pandas ☆48 · Updated 2 years ago
- RFC document, tooling and other content related to the dataframe API standard ☆110 · Updated last year
- The SQL Standards Project aims to create consensus in SQL semantics ☆46 · Updated 8 months ago
- ☆33 · Updated last year
- A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to … ☆230 · Updated last week
- Train Gradient Boosting and Random Forest with only SQL (VLDB 2023) ☆23 · Updated last year
- Apache Arrow PostgreSQL connector ☆61 · Updated last year
- ☆45 · Updated 2 weeks ago
- Distributed SQL Engine in Python using Dask ☆406 · Updated 10 months ago
- reproducible benchmark of database-like ops ☆339 · Updated 2 years ago
- Arrow, pydantic style ☆84 · Updated 2 years ago
- reproducible benchmark of database-like ops ☆169 · Updated 3 weeks ago
- easy install parquet-tools ☆180 · Updated last year
- ☆79 · Updated 2 years ago
- Documentation for Hyper, the blazingly fast SQL engine powering analytics at Tableau and Salesforce ☆30 · Updated 2 weeks ago
- Helpers for Arrow C Data & Arrow C Stream interfaces ☆193 · Updated last week
- ☆146 · Updated 3 months ago
- Template for DuckDB extensions to help you develop, test and deploy a custom extension ☆208 · Updated last week
- ☆99 · Updated 2 weeks ago
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame. ☆301 · Updated last year
- Unified Distributed Execution ☆54 · Updated 8 months ago
- Language-independent Continuous Benchmarking (CB) Framework ☆109 · Updated 10 months ago
- Distributed SQL Query Engine in Python using Ray ☆243 · Updated 9 months ago
- Inspect ML Pipelines in Python in the form of a DAG ☆70 · Updated last year
- Python binding for DataFusion ☆59 · Updated 2 years ago
- Data pipelines from re-usable components ☆108 · Updated 2 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 · Updated last year
- Flow with FlorDB 🌻 ☆154 · Updated last month
- DuckDB is an in-process SQL OLAP Database Management System ☆44 · Updated 3 weeks ago