amazon-science / redset
Redset is a dataset containing three months' worth of metadata for user queries that ran on a selected sample of instances in the Amazon Redshift fleet. We provide query metadata for 200 provisioned and 200 serverless instances.
☆62 Updated 11 months ago
Alternatives and similar repositories for redset
Users who are interested in redset are comparing it to the repositories listed below.
- BI benchmark with user generated data and queries ☆70 Updated 8 months ago
- BtrBlocks: Efficient Columnar Compression for Data Lakes (SIGMOD 2023 Paper) ☆251 Updated 4 months ago
- Apache DataFusion Benchmarks ☆20 Updated 4 months ago
- ☆47 Updated last month
- Ibis Substrait Compiler ☆104 Updated this week
- ☆302 Updated this week
- Next-Gen Big Data File Format ☆440 Updated 3 weeks ago
- Code repo for "An Empirical Evaluation of Columnar Storage Formats" VLDB Vol 17 ☆62 Updated last year
- AnyBlob - A Universal Cloud Object Storage Download Manager Built For Cost-Throughput Optimal Analytics! ☆136 Updated 3 weeks ago
- New file format for storage of large columnar datasets. ☆586 Updated 2 weeks ago
- InkFuse - An Experimental Database Runtime Unifying Vectorized and Compiled Query Execution. ☆52 Updated last year
- Reference implementations for the LDBC Social Network Benchmark's Business Intelligence (BI) workload ☆43 Updated 4 months ago
- ☆79 Updated 2 years ago
- A benchmark for serverless analytic databases. ☆22 Updated 11 months ago
- Template for DuckDB extensions to help you develop, test and deploy a custom extension ☆213 Updated last month
- tpch-dbgen ☆38 Updated 13 years ago
- Distributed SQL Query Engine in Python using Ray ☆244 Updated 10 months ago
- TPC-H dbgen ☆310 Updated last year
- A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to … ☆234 Updated 3 weeks ago
- DuckDB extension allowing reading/writing of vortex files ☆15 Updated last week
- Collection of experiments to carve out the differences between two types of relational query processing engines: Vectorizing (interpretat… ☆261 Updated 7 years ago
- Apache Iceberg C++ ☆106 Updated last week
- PRISM is a UDF optimization framework that deconstructs a UDF into separate inlinable and outlinable pieces, resulting in simpler queries… ☆17 Updated last week
- Core C++ Sketch Library ☆237 Updated last week
- In-Memory Analytics with Apache Arrow, published by Packt ☆103 Updated last year
- Apache DataFusion Ray ☆217 Updated 2 weeks ago
- Pollock is a benchmark for data loading on character-delimited files. ☆20 Updated 4 months ago
- TPC-DS queries ☆62 Updated 10 years ago
- Snowflake dataset containing statistics for 70 million queries over 14 day period ☆113 Updated 3 years ago
- Reproducing TPC-DS qualification/reference results ☆32 Updated 2 years ago