data-integrations / wrangler
Wrangler Transform: A DMD system for transforming Big Data
☆91 · Updated this week
Alternatives and similar repositories for wrangler:
Users who are interested in wrangler are comparing it to the repositories listed below:
- Cask Hydrator Plugins Repository · ☆67 · Updated this week
- Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow. · ☆142 · Updated 8 months ago
- ☆39 · Updated 5 years ago
- Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple… · ☆26 · Updated 3 years ago
- Quark is a data virtualization engine over analytic databases. · ☆98 · Updated 7 years ago
- Apiary provides modules which can be combined to create a federated cloud data lake · ☆36 · Updated 10 months ago
- DBeam exports SQL tables into Avro files using JDBC and Apache Beam · ☆194 · Updated this week
- A collection of Google Cloud Platform (GCP) plugins · ☆45 · Updated this week
- Apache Beam Site · ☆29 · Updated this week
- ☆65 · Updated 5 months ago
- CDAP UI · ☆19 · Updated this week
- Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP · ☆90 · Updated 6 months ago
- ☆81 · Updated last year
- Collection of examples integrating NiFi with stream process frameworks. · ☆56 · Updated 8 years ago
- ACID Data Source for Apache Spark based on Hive ACID · ☆97 · Updated 3 years ago
- ☆13 · Updated this week
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. · ☆72 · Updated 3 years ago
- CDAP Applications · ☆43 · Updated 7 years ago
- ☆46 · Updated 9 months ago
- Schema Registry integration for Apache Spark · ☆40 · Updated 2 years ago
- A library for Spark DataFrame using MinIO Select API · ☆97 · Updated 5 years ago
- A utility for generating Oozie workflows from a YAML definition · ☆48 · Updated 5 years ago
- SQL data model for working with Snowplow web data. Supports Redshift and Looker. Snowflake and BigQuery coming soon · ☆60 · Updated 4 years ago
- Apache Spark ETL Utilities · ☆40 · Updated 3 months ago
- Snowflake Data Source for Apache Spark. · ☆224 · Updated 2 months ago
- Scalable CDC Pattern Implemented using PySpark · ☆18 · Updated 5 years ago
- Sample code with integration between Data Catalog and BI data sources. · ☆32 · Updated 3 years ago
- Build configuration-driven ETL pipelines on Apache Spark · ☆159 · Updated 2 years ago
- ☆127 · Updated 9 months ago
- Sample code with integration between Data Catalog and RDBMS data sources. · ☆72 · Updated 3 years ago