h2oai / db-benchmark
reproducible benchmark of database-like ops
☆329 · Updated last year
Alternatives and similar repositories for db-benchmark:
Users who are interested in db-benchmark are comparing it to the libraries listed below.
- reproducible benchmark of database-like ops ☆153 · Updated last week
- Helpers for Arrow C Data & Arrow C Stream interfaces ☆178 · Updated this week
- Apache Arrow Cookbook ☆99 · Updated this week
- Apache DataFusion Python Bindings ☆407 · Updated last week
- Database connectivity API standard and libraries for Apache Arrow ☆398 · Updated this week
- RFC document, tooling and other content related to the dataframe API standard ☆105 · Updated 10 months ago
- Ibis Substrait Compiler ☆98 · Updated this week
- IbisML is a library for building scalable ML pipelines using Ibis. ☆96 · Updated last month
- Distributed SQL Engine in Python using Dask ☆398 · Updated 5 months ago
- Pandas ExtensionDType/Array backed by Apache Arrow ☆229 · Updated last year
- Book documentation of the Polars DataFrame library ☆186 · Updated last year
- An example Flight SQL Server implementation - with DuckDB and SQLite back-ends. ☆227 · Updated 4 months ago
- Python binding for DataFusion ☆59 · Updated 2 years ago
- Distributed SQL Query Engine in Python using Ray ☆242 · Updated 3 months ago
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame. ☆298 · Updated 7 months ago
- Language-independent Continuous Benchmarking (CB) Framework ☆105 · Updated 5 months ago
- Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with … ☆626 · Updated this week
- Apache Arrow Flight SQL adapter for PostgreSQL ☆73 · Updated 3 weeks ago
- Polars extension for general data science use cases ☆429 · Updated this week
- Quickly view your data ☆294 · Updated 2 weeks ago
- Automatically upgrade your Polars code to use the latest syntax available ☆62 · Updated 7 months ago
- Fuzzy Data Benchmark ☆17 · Updated 11 months ago
- High performance Python GLMs with all the features! ☆321 · Updated 2 weeks ago
- LETSQL is a deferred compute system focused on intelligent composition of AI pipelines. Optimize performance with cross-engine caching an… ☆90 · Updated this week
- Polars plugin offering eXtra stuff for DateTimes ☆194 · Updated last month
- Serverside scaling for Vega and Altair visualizations ☆344 · Updated 2 months ago
- A purely experimental DuckDB Deltalake extension ☆94 · Updated this week
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 · Updated 10 months ago
- In-Memory Analytics with Apache Arrow, published by Packt ☆94 · Updated last year
- A Python-to-SQL transpiler as replacement for Python Pandas ☆48 · Updated 2 years ago