mirkoprescha / spark-zeppelin-docker
Docker image with Spark and Zeppelin
☆12 Updated 5 years ago
Alternatives and similar repositories for spark-zeppelin-docker:
Users who are interested in spark-zeppelin-docker are comparing it to the repositories listed below.
- Hadoop Data Pipeline using Falcon ☆15 Updated 9 years ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an… ☆61 Updated 8 months ago
- Apache NiFi NLP Processor ☆18 Updated last year
- How to manage Slowly Changing Dimensions with Apache Hive ☆55 Updated 5 years ago
- 🚚 ETL for Spark and Airflow ☆25 Updated 7 years ago
- Example project showing how to use Hive UDFs in Apache Spark ☆55 Updated 6 years ago
- Simple Spark example of generating table stats for use of data quality checks ☆28 Updated 8 years ago
- ☆26 Updated 4 years ago
- A sample implementation of the Spark Datasource API ☆23 Updated 8 years ago
- ☆10 Updated 2 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆88 Updated last year
- Documentation and resources for deploying JupyterHub on Hadoop ☆18 Updated 5 years ago
- The iterative broadcast join example code. ☆69 Updated 7 years ago
- UI to run SQL on Delta Lake tables and visualize the variations of the result among tables versions ☆12 Updated 5 years ago
- ☆16 Updated 2 years ago
- Docker image for Apache Spark ☆76 Updated 5 years ago
- type-class based data cleansing library for Apache Spark SQL ☆78 Updated 5 years ago
- Sample processing code using Spark 2.1+ and Scala ☆52 Updated 4 years ago
- Real-world Spark pipelines examples ☆83 Updated 7 years ago
- ☆72 Updated 4 years ago
- A small project to show how to add lineage to Atlas when using Spark as ETL tool ☆12 Updated 8 years ago
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o… ☆51 Updated last year
- Profiles the data, validates the schema and runs data quality checks and produces a report ☆20 Updated 5 years ago
- JSON schema parser for Apache Spark ☆81 Updated 2 years ago
- Avro record class and reader generator ☆20 Updated 2 years ago
- Support Highcharts in Apache Zeppelin ☆81 Updated 7 years ago
- A single docker image that combines Neo4j Mazerunner and Apache Spark GraphX into a powerful all-in-one graph processing engine ☆46 Updated 5 years ago
- Collection of HDP Tuning Tricks & Tips (unofficial guide) ☆17 Updated 7 years ago
- Code to index Hive tables to Solr and Solr indexes to Hive ☆48 Updated 5 years ago
- A bridge to Apache Atlas for provenance metadata created in course of using Apache NiFi ☆15 Updated 2 years ago