apache / orc-format
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
☆13 Updated 2 weeks ago
Alternatives and similar repositories for orc-format
Users who are interested in orc-format compare it to the libraries listed below.
- Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer. ☆37 Updated 2 years ago
- All the things about TPC-DS in Apache Spark ☆107 Updated 2 years ago
- An Extensible Data Skipping Framework ☆47 Updated last month
- Lakehouse storage system benchmark ☆76 Updated 2 years ago
- Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations. ☆257 Updated 2 years ago
- A re-implementation of Hadoop DistCP in Apache Spark ☆47 Updated last year
- ☆65 Updated last year
- Java bindings for https://github.com/facebookincubator/velox ☆33 Updated this week
- ☆90 Updated last week
- Port of TPC-DS dsdgen to Java ☆21 Updated 2 years ago
- A Persistent Key-Value Store designed for Streaming processing ☆106 Updated 5 months ago
- Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange ☆128 Updated 8 months ago
- DS2 is an auto-scaling controller for distributed streaming dataflows ☆90 Updated 2 years ago
- ☆35 Updated last year
- Apache DataSketches ☆98 Updated 2 years ago
- Mirror of Apache Crail (Incubating) ☆150 Updated 3 years ago
- Apache Paimon Python: the Python implementation of Apache Paimon. ☆16 Updated last month
- Apache Iceberg C++ ☆110 Updated this week
- Spark Shuffle Optimization with RDMA+AEP ☆30 Updated 2 years ago
- SparkCube is an open-source project for extremely fast OLAP data analysis. SparkCube is an extension of Apache Spark. ☆134 Updated 2 years ago
- A new C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems. ☆28 Updated this week
- Gluten: Plugin to Boost Trino's Performance ☆75 Updated last year
- Client libraries of end users of Apache Kyuubi ☆11 Updated 2 years ago
- Cache File System optimized for columnar formats and object stores ☆185 Updated 3 years ago
- Mirror of Apache Omid Incubator ☆90 Updated 3 weeks ago
- ☆33 Updated 3 months ago
- A series of Jupyter notebooks to demonstrate the functionality of Apache Calcite ☆59 Updated 5 years ago
- Remote shuffle service for Apache Spark to store shuffle data on remote servers. ☆329 Updated last year
- Uniffle is a high performance, general purpose Remote Shuffle Service. ☆426 Updated last week
- Benchmarks for queries over continuous data streams. ☆358 Updated 8 months ago