MarquezProject / marquez-web
Marquez Web UI
☆22 Updated 4 years ago
Related projects
Alternatives and complementary repositories for marquez-web
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 Updated 7 years ago
- A library for strong, schema based conversion between 'natural' JSON documents and Avro ☆18 Updated 8 months ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆86 Updated 8 months ago
- Cask Hydrator Plugins Repository ☆67 Updated 3 weeks ago
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores. ☆20 Updated 3 years ago
- ☆13 Updated last week
- Apache-Spark based Data Flow (ETL) Framework which supports multiple read, write destinations of different types and also support multiple… ☆26 Updated 3 years ago
- An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes ☆53 Updated 4 years ago
- Use SQL to transform your avro schema/records ☆28 Updated 6 years ago
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o… ☆48 Updated 10 months ago
- Apache Beam Site ☆29 Updated this week
- Dione - a Spark and HDFS indexing library ☆50 Updated 8 months ago
- Mutation testing framework and code coverage for Hive SQL ☆24 Updated 3 years ago
- Quark is a data virtualization engine over analytic databases. ☆99 Updated 7 years ago
- Demonstration of a Hive Input Format for Iceberg ☆26 Updated 3 years ago
- Common components used across the datamountaineer kafka connect connectors ☆21 Updated 3 years ago
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor… ☆41 Updated last year
- A plugin to Apache Airflow to allow you to run Spark Submit Commands as an Operator ☆75 Updated 5 years ago
- Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http:… ☆71 Updated last year
- Extensions available for use in Apiary ☆10 Updated 2 months ago
- JDBC driver for Apache Kafka ☆87 Updated 2 years ago
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆66 Updated 9 months ago
- Scalable CDC Pattern Implemented using PySpark ☆18 Updated 5 years ago
- Apache Spark ETL Utilities ☆40 Updated 3 weeks ago
- A facebook for data ☆26 Updated 5 years ago
- The sane way of building a data layer in Airflow ☆24 Updated 4 years ago
- A Spark datasource for the HadoopOffice library ☆39 Updated 2 years ago
- Data pipeline automation tool ☆25 Updated 10 months ago