streamsets / tutorials
StreamSets Tutorials
☆351 · Updated last year
Alternatives and similar repositories for tutorials
Users who are interested in tutorials are comparing it to the repositories listed below.
- Apache NiFi example flows ☆207 · Updated 5 years ago
- A collection of templates for use with Apache NiFi. ☆280 · Updated 8 years ago
- Cloudera Manager API Client ☆308 · Updated last year
- A Spark Atlas connector to track data lineage in Apache Atlas ☆266 · Updated 2 years ago
- Build configuration-driven ETL pipelines on Apache Spark ☆161 · Updated 3 years ago
- DataQuality for BigData ☆144 · Updated last year
- Ambari service for Apache Flink ☆127 · Updated 4 years ago
- Ambari stack service for easily installing and managing Hue on HDP cluster ☆107 · Updated 6 years ago
- This Apache Atlas is built from the latest release source tarball and patched to be run in a Docker container. ☆143 · Updated last year
- Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version ☆271 · Updated last year
- Cloudera Manager Extensibility Tools and Documentation. ☆190 · Updated last year
- A Maven-based example of using Cloudera Impala's JDBC driver ☆118 · Updated 9 years ago
- Ambari service for Presto ☆44 · Updated 8 months ago
- Mirror of Apache Knox ☆206 · Updated this week
- Mirror of Apache Sentry ☆120 · Updated 5 years ago
- Mirror of Apache Atlas (Incubating) ☆95 · Updated 2 years ago
- A tool to install, configure and manage Presto installations ☆171 · Updated 2 years ago
- Dockerfiles for StreamSets Data Collector ☆114 · Updated 7 months ago
- Bigtop is an Apache Foundation project for Infrastructure Engineers and Data Scientists looking for comprehensive packaging, testing, and… ☆657 · Updated last month
- Few scripts to automate daily data loads from RDBMS to Partitioned Avro Hive table ☆30 · Updated 11 years ago
- High Performance Kafka Connector for Spark Streaming. Supports Multi Topic Fetch, Kafka Security. Reliable offset management in Zookeeper.… ☆633 · Updated 3 years ago
- An open source framework for building data analytic applications. ☆784 · Updated this week
- ☆240 · Updated 4 years ago
- CDP Public Cloud is an integrated analytics and data management platform deployed on cloud services. It offers broad data analytics and a… ☆361 · Updated this week
- Presto Plugin for Oracle JDBC Connection ☆43 · Updated 2 years ago
- Kettle plugin that provides support for interacting within many "big data" projects including Hadoop, Hive, HBase, Cassandra, MongoDB, an… ☆238 · Updated this week
- Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies… ☆1,114 · Updated 2 years ago
- Sparkline BI Accelerator provides fast ad-hoc query capability over Logical Cubes. This has been folded into our SNAP Platform (http://bit… ☆282 · Updated 7 years ago
- Apache Flink docker image ☆195 · Updated 3 years ago
- Docker image with Ambari ☆291 · Updated 7 years ago