frischHWC / datagen
Datagenerator for Data Services
☆16 Updated 4 months ago
Alternatives and similar repositories for datagen
Users who are interested in datagen are comparing it to the repositories listed below.
- One Click Script to Deploy CDP (CDP PvC & HDP & CDH) ☆32 Updated 4 months ago
- Kerberos and Hadoop: The Madness beyond the Gate ☆282 Updated 2 years ago
- Port of TPC-DS dsdgen to Java ☆50 Updated last year
- Prerequisites checker for Cloudera Manager and CDP PVC Base installations ☆58 Updated 2 years ago
- Cloudera Manager Extensibility Tools and Documentation. ☆193 Updated 2 years ago
- Edge2AI Workshop ☆70 Updated 7 months ago
- ☆27 Updated 2 years ago
- Example to create lineage in Atlas with sqoop and spark ☆14 Updated 8 years ago
- Cloudera deployment automation with Ansible ☆200 Updated 5 years ago
- Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an A… ☆132 Updated 3 weeks ago
- Bigtop is an Apache Foundation project for Infrastructure Engineers and Data Scientists looking for comprehensive packaging, testing, and… ☆667 Updated 3 months ago
- A collection of templates for use with Apache NiFi. ☆278 Updated 9 years ago
- Hive for MR3 ☆38 Updated this week
- Tutorial on how to setup Trino and Apache Ranger using docker ☆41 Updated last year
- Hadoop FSImage Analyzer (HFSA) ☆66 Updated this week
- ACID Data Source for Apache Spark based on Hive ACID ☆96 Updated 4 years ago
- Data Lineage Tracking And Visualization Solution ☆652 Updated this week
- Spline agent for Apache Spark ☆201 Updated last week
- Collection of tools for bootstrapping Apache Ambari & deploying clusters ☆83 Updated 6 years ago
- TPC-DS Kit for Impala ☆170 Updated last year
- Remedy small files by combining them into larger ones. ☆23 Updated 7 years ago
- Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments. ☆285 Updated 2 months ago
- Cloudera CDP SDK for Java ☆15 Updated 3 weeks ago
- Ambari stack service for installing and managing Apache Airflow on HDP cluster ☆58 Updated 7 years ago
- A Spark Atlas connector to track data lineage in Apache Atlas ☆264 Updated 3 years ago
- hadoop-mini-clusters provides an easy way to test Hadoop projects directly in your IDE ☆296 Updated 3 years ago
- Cloudera Manager API Client ☆308 Updated 2 years ago
- Ansible playbooks for deploying Hortonworks Data Platform and DataFlow using Ambari Blueprints ☆248 Updated 4 years ago
- Install Ambari 2.7.5 with HDP 3.1.4 without using Hortonworks repositories. ☆48 Updated 4 years ago
- A slightly moist lipstick-on-pig clone for Apache Hive ☆23 Updated 2 years ago