An implementation of the DatasourceV2 interface of Apache Spark™ for writing Spark Datasets to Apache Druid™.
☆43 · Mar 1, 2026 · Updated last week
Alternatives and similar repositories for rovio-ingest
Users who are interested in rovio-ingest are comparing it to the libraries listed below.
- Hadoop InputFormat for http://druid.io/ ☆10 · Oct 26, 2016 · Updated 9 years ago
- Showing the relationship between ImageNet ID and labels and pytorch pre-trained model output ID and labels ☆10 · Oct 11, 2020 · Updated 5 years ago
- Scalable CDC Pattern Implemented using PySpark ☆18 · Oct 8, 2025 · Updated 5 months ago
- Parallel Streaming Transformation Loader ☆10 · Apr 23, 2019 · Updated 6 years ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark ☆16 · Jan 22, 2024 · Updated 2 years ago
- UI for mondrian-rest ☆20 · Apr 17, 2019 · Updated 6 years ago
- ☆19 · Sep 8, 2017 · Updated 8 years ago
- ☆22 · Jul 2, 2025 · Updated 8 months ago
- Kanto ☆26 · Nov 5, 2024 · Updated last year
- Maelstrom is an open source Kafka integration with Spark that is designed to be developer friendly, high performance (millisecond stream … ☆22 · Feb 6, 2017 · Updated 9 years ago
- Template for multi-modal machine learning in healthcare using Kedro. Combine reports, tabular data and images using various fusion method… ☆24 · Mar 21, 2025 · Updated 11 months ago
- XML for Analysis (XMLA) server based upon an olap4j connection ☆23 · Dec 8, 2016 · Updated 9 years ago
- Plugin for Cura slicer ☆10 · Feb 1, 2023 · Updated 3 years ago
- Druid indexing plugin for using Spark in batch jobs ☆101 · Oct 21, 2021 · Updated 4 years ago
- ☆38 · May 22, 2024 · Updated last year
- Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple… ☆26 · Jun 7, 2021 · Updated 4 years ago
- This is the GitHub meeting point for Dr. Campbell's Digital Humanities classes at Pitt-Greensburg. ☆20 · Feb 8, 2026 · Updated last month
- A dynamic data completeness and accuracy library at enterprise scale for Apache Spark ☆29 · Nov 4, 2024 · Updated last year
- ☆10 · Jun 29, 2021 · Updated 4 years ago
- Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌 ☆29 · May 15, 2020 · Updated 5 years ago
- Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark. ☆76 · Apr 24, 2024 · Updated last year
- An adhoc reporting client based on Pentaho Metadata Layer ☆32 · Mar 20, 2013 · Updated 12 years ago
- A big data cluster management tool that creates and manages clusters of different technologies. ☆21 · Apr 20, 2015 · Updated 10 years ago
- ☆12 · Updated this week
- This is a complete suite of spring boot couchbase and kafka ☆12 · Dec 10, 2018 · Updated 7 years ago
- Apache Spark based framework for analysis A/B experiments ☆15 · Nov 3, 2024 · Updated last year
- Visual tool for SPARQL queries on graphol graphs ☆10 · Oct 3, 2018 · Updated 7 years ago
- A web server to generate ER diagrams ☆34 · Mar 11, 2019 · Updated 6 years ago
- GRASS GIS module for wildfire simulation wrapping r.ros and r.spread modules ☆11 · Dec 13, 2021 · Updated 4 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆30 · Updated this week
- CODO is an ontology for the semantic representation and annotation of COVID-19 data in a machine-readable form for tracking history of th… ☆10 · Apr 19, 2022 · Updated 3 years ago
- Apache Calcite Tutorial ☆33 · Jun 24, 2016 · Updated 9 years ago
- ☆10 · Jan 18, 2023 · Updated 3 years ago
- A simple Django middleware for submitting timings and exceptions to Datadog. ☆13 · Jun 26, 2017 · Updated 8 years ago
- a simple lakeFS webhook for pre-commit and pre-merge validation of data objects ☆12 · Nov 9, 2023 · Updated 2 years ago
- Local Development of AWS Glue with Docker and Visual Studio Code ☆14 · Nov 29, 2021 · Updated 4 years ago
- Online voter registration (OVR) application-as-a-service for 3rd party registrar organizations. ☆11 · Feb 23, 2026 · Updated 2 weeks ago
- The purpose of this module is to provide ready to use user-data file for Hetzner cloud servers with multiple network managers. ☆15 · Mar 24, 2025 · Updated 11 months ago
- ☆15 · Feb 25, 2026 · Updated last week