suhailrehman / fuzzydata
Fuzzy Data Benchmark
☆17 Updated last year
Alternatives and similar repositories for fuzzydata
Users who are interested in fuzzydata are comparing it to the libraries listed below.
- Ibis Substrait Compiler ☆108 Updated last week
- A Python-to-SQL transpiler as replacement for Python Pandas ☆49 Updated 3 years ago
- Dias: Dynamic Rewriting of Pandas Code ☆79 Updated 5 months ago
- RFC document, tooling and other content related to the dataframe API standard ☆108 Updated last year
- The SQL Standards Project aims to create consensus in SQL semantics ☆47 Updated last year
- ☆53 Updated last week
- ☆34 Updated 2 years ago
- Unified Distributed Execution ☆57 Updated last year
- ☆115 Updated 2 weeks ago
- Distributed SQL Engine in Python using Dask ☆409 Updated last year
- A portable Multimodal Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to you… ☆261 Updated this week
- Train Gradient Boosting and Random Forest with only SQL (VLDB 2023) ☆24 Updated 2 years ago
- Apache Arrow PostgreSQL connector ☆62 Updated last year
- Distributed SQL Query Engine in Python using Ray ☆246 Updated last year
- Inspect ML Pipelines in Python in the form of a DAG ☆70 Updated last year
- Arrow, pydantic style ☆85 Updated 3 years ago
- Language-independent Continuous Benchmarking (CB) Framework ☆115 Updated last year
- Lambda Learner is a library for iterative incremental training of a class of supervised machine learning models. ☆41 Updated 2 years ago
- A playground for running duckdb as a stateless query engine over a data lake ☆217 Updated last year
- Prototype compiler from SaneQL to SQL ☆86 Updated 2 years ago
- A software engineering framework to jump start your machine learning projects ☆37 Updated last year
- A Python package that parses sql and converts it to ibis expressions ☆56 Updated 2 years ago
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame. ☆301 Updated last year
- [SIGMOD 2026] F3: The Open-Source Data File Format for the Future ☆299 Updated last month
- easy install parquet-tools ☆182 Updated last year
- A Delta Lake reader for Dask ☆53 Updated 5 months ago
- IbisML is a library for building scalable ML pipelines using Ibis. ☆119 Updated 5 months ago
- Coming soon ☆63 Updated 2 years ago
- Python binding for DataFusion ☆59 Updated 3 years ago
- openclean - Data Cleaning and data profiling library for Python ☆83 Updated 4 years ago