bikash / DataQualityLinks
Tutorial and examples of Data Quality in Big Data System
☆12 · Updated 8 years ago
Alternatives and similar repositories for DataQuality
Users that are interested in DataQuality are comparing it to the libraries listed below
- Leveraging Hortonworks' HDP 3.1.0 and HDF 3.4.0 components, this tutorial guides the user through steps to stream data from a REST API in… ☆19 · Updated 5 years ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆153 · Updated this week
- The Taxonomy for ETL Automation Metadata (TEAM) is a tool for design metadata management geared towards data warehouse automation. It is … ☆36 · Updated 4 months ago
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆126 · Updated 3 years ago
- DataQuality for BigData ☆144 · Updated last year
- a set of scripts to pull meta data and data profiling metrics from relational database systems ☆77 · Updated last year
- ☆39 · Updated 6 years ago
- Tool to automate data quality checks on data pipelines ☆255 · Updated 2 years ago
- DIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics control framework that can be used to monitor, log, aud… ☆27 · Updated 3 weeks ago
- A proof of concept using Divolte, Kafka, Druid and Superset ☆62 · Updated 5 years ago
- Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple… ☆26 · Updated 4 years ago
- TinyOlap is a light-weight, in-process, in-memory, multi-dimensional, model-first OLAP engine for planning, budgeting, reporting, analysi… ☆48 · Updated 3 years ago
- Mirror of Apache Arrow ☆32 · Updated this week
- Drools processor for Apache NiFi ☆38 · Updated 5 years ago
- Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi ☆114 · Updated last year
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆61 · Updated 2 years ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆95 · Updated last week
- Superset Quick Start Guide, published by Packt ☆56 · Updated last year
- spark-drools tutorials ☆16 · Updated last year
- Egeria's Guidance on Governance as well as large media files such as presentations and movies ☆104 · Updated 2 years ago
- MonitoFi: Health & Performance Monitor for your Apache NiFi ☆64 · Updated last year
- Eskimo is a state of the art Big Data Infrastructure and Management Web Console to build, manage and operate Big Data 2.0 Analytics clust… ☆26 · Updated last year
- Dremio driver for Metabase BI ☆51 · Updated 7 months ago
- Generic interface exchange format for Data Warehouse Automation and ETL generation. ☆41 · Updated 11 months ago
- Open-source metadata collector based on ODD Specification ☆44 · Updated last year
- Db2 JDBC connector for Trino ☆19 · Updated 2 years ago
- Flink dynamic CEP demo ☆19 · Updated 3 years ago
- Data Lineage Tracing Library ☆23 · Updated 3 years ago
- XML/A engine for real-time OLAP analytics ☆48 · Updated 8 years ago
- Collection of examples integrating NiFi with stream process frameworks. ☆59 · Updated 8 years ago