twitter / caladrius
Performance modelling system for Distributed Stream Processing Systems (DSPS) such as Apache Heron and Apache Storm
☆22 · Updated last year
Related projects
Alternatives and complementary repositories for caladrius
- Mirror of Apache Arrow site ☆34 · Updated this week
- Query processing for an extremely simple, in-memory, columnar database using Apache Arrow to represent tables ☆22 · Updated 3 years ago
- Shaded version of Apache Hive for Trino ☆9 · Updated 3 months ago
- Self regulation and auto-tuning for distributed system ☆64 · Updated last year
- Spark* plug-in for accelerating Spark* SQL performance by using cache and index at SQL data source layer. ☆37 · Updated last year
- Repository for building CDAP and additional external projects ☆15 · Updated this week
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating) ☆62 · Updated 6 months ago
- pulsar lakehouse connector ☆30 · Updated this week
- Lakehouse storage system benchmark ☆66 · Updated last year
- SnailTrail implementation ☆38 · Updated 5 years ago
- Dione - a Spark and HDFS indexing library ☆50 · Updated 8 months ago
- Website for DataSketches. ☆95 · Updated last week
- Albis: High-Performance File Format for Big Data Systems ☆21 · Updated 6 years ago
- Demonstration of a Hive Input Format for Iceberg ☆26 · Updated 3 years ago
- Apache Kyuubi Site ☆13 · Updated 3 weeks ago
- Simple interface to read, organize, and manipulate structured data in files on local and cloud storage ☆33 · Updated last year
- DS2 is an auto-scaling controller for distributed streaming dataflows ☆88 · Updated last year
- The sane way of building a data layer in Airflow ☆24 · Updated 4 years ago
- LinkedIn's version of Apache Calcite ☆22 · Updated 2 weeks ago
- ☆15 · Updated 5 months ago
- Fast I/O plugins for Spark ☆41 · Updated 3 years ago
- Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients ☆36 · Updated 3 years ago
- Condor allows for the specification of synopsis-based streaming jobs on top of general dataflow systems. Condor provides a collection of … ☆13 · Updated 5 months ago
- A tool to install, configure and manage Trino installations ☆27 · Updated 2 years ago
- Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses. ☆13 · Updated last week
- Apache Iceberg Documentation Site ☆42 · Updated 9 months ago
- Alluxio Python client - Access Any Data Source with Python ☆26 · Updated 3 weeks ago
- Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines ☆17 · Updated 4 years ago
- Apache Arrow Cookbook ☆96 · Updated 3 weeks ago
- struct2tensor is a library for parsing and manipulating structured data inside of tensorflow. ☆34 · Updated last month