TileDB-Inc / TileDB-Spark
Spark interface to the TileDB storage manager [please see README]
☆15 Updated last month
Related projects
Alternatives and complementary repositories for TileDB-Spark
- ☆104 Updated last year
- Spark Structured Streaming State Tools ☆34 Updated 4 years ago
- A benchmark tool for lakehouses. ☆11 Updated last year
- Library for organizing batch processing pipelines in Apache Spark ☆41 Updated 7 years ago
- Splittable Gzip codec for Hadoop ☆69 Updated this week
- JVM integration for Weld ☆16 Updated 6 years ago
- TileDB integrations for machine learning data and model I/O (PyTorch, TensorFlow, Scikit-Learn) ☆23 Updated last month
- Miscellaneous functionality for manipulating Apache Spark RDDs. ☆22 Updated 5 years ago
- Temporal_Graph_library ☆25 Updated 5 years ago
- Sketch adaptors for Hive. ☆49 Updated 2 months ago
- Transporter for integrating OpenLineage with OpenMetadata ☆12 Updated 8 months ago
- A new C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems. ☆21 Updated this week
- A composable framework for fast and scalable data analytics ☆57 Updated last year
- This repository provides Scotty, a framework for efficient window aggregations for out-of-order Stream Processing. ☆75 Updated last year
- A library that brings useful functions from various modern database management systems to Apache Spark ☆56 Updated last year
- Cloudera CDP SDK for Java ☆13 Updated last week
- Idempotent query executor ☆50 Updated 10 months ago
- Apache datasketches ☆88 Updated last year
- ☆77 Updated this week
- Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol ☆34 Updated 2 years ago
- Milan is a Scala API and runtime infrastructure for building data-oriented systems, built on top of Apache Flink. ☆39 Updated last year
- Enabling Spark Optimization through Cross-stack Monitoring and Visualization ☆47 Updated 7 years ago
- Self-regulation and auto-tuning for distributed systems ☆64 Updated last year
- A library for exporting Spark ML models and pipelines to PFA ☆54 Updated 6 years ago
- Mirror of Apache MRQL (Incubating) ☆17 Updated 7 years ago
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. ☆72 Updated 3 years ago
- A tool to get better debug info on Spark's memory usage ☆42 Updated 5 years ago
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating) ☆62 Updated 6 months ago
- Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange ☆127 Updated last month