apache / datasketches-python
Apache DataSketches
☆24 Updated this week
Alternatives and similar repositories for datasketches-python:
Users who are interested in datasketches-python are comparing it to the libraries listed below.
- PostgreSQL extension providing approximate algorithms based on apache/datasketches-cpp ☆86 Updated this week
- Core C++ Sketch Library ☆229 Updated this week
- ☆20 Updated last year
- A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to … ☆173 Updated this week
- Ibis Substrait Compiler ☆98 Updated this week
- A Python-to-SQL transpiler as replacement for Python Pandas ☆48 Updated 2 years ago
- Code and Benchmarks for JOSIE (SIGMOD 2019) ☆18 Updated last year
- Apache Arrow PostgreSQL connector ☆57 Updated 11 months ago
- DuckDB extension that adds support for SQL/PGQ and graph algorithms ☆106 Updated this week
- Delta reader for the Ray open-source toolkit for building ML applications ☆43 Updated 11 months ago
- ☆30 Updated this week
- ☆11 Updated last year
- Redset is a dataset containing three months' worth of user query metadata that ran on a selected sample of instances in the Amazon Redshif… ☆48 Updated 4 months ago
- Train Gradient Boosting and Random Forest with only SQL (VLDB 2023) ☆21 Updated last year
- ☆192 Updated last week
- Apache Arrow Flight SQL adapter for PostgreSQL ☆72 Updated 2 weeks ago
- A playground for running DuckDB as a stateless query engine over a data lake ☆184 Updated last year
- ☆65 Updated 5 months ago
- Point-in-Time optimizations for Apache Spark ☆29 Updated last year
- DuckDB is an in-process SQL OLAP Database Management System ☆41 Updated this week
- Sample code to accompany blog post showcasing Arrow Flight SQL running on DuckDB ☆27 Updated 2 years ago
- Distributed Bayesian Entity Resolution in Apache Spark ☆57 Updated 3 years ago
- Template for DuckDB extensions to help you develop, test and deploy a custom extension ☆160 Updated last week
- ☆68 Updated 2 weeks ago
- Integrates DuckDB with Google BigQuery, allowing direct querying and management of BigQuery datasets ☆83 Updated last week
- Demo repository to lambda-fy your dbt runs ☆11 Updated last year
- An experimental Athena extension for DuckDB 🐤 ☆51 Updated 2 weeks ago
- Reproducible benchmark of database-like ops ☆152 Updated 2 months ago
- Graph Engine for Exploration and Search ☆40 Updated 11 months ago
- A write-audit-publish implementation on a data lake without the JVM ☆45 Updated 5 months ago