flaminem / docker-hive-spark
A docker image with a pre-configured Hive Metastore and a Spark ThriftServer
☆19 · Updated 5 years ago
Alternatives and similar repositories for docker-hive-spark
Users who are interested in docker-hive-spark are comparing it to the repositories listed below.
- A Spark datasource for the HadoopOffice library ☆38 · Updated 2 years ago
- An Apache Airflow plugin that lets you run Spark Submit commands as an Operator ☆73 · Updated 5 years ago
- PostgreSQL configured to work as a metastore for Hive. ☆32 · Updated 2 years ago
- Docker images used internally by various Teradata projects for automation, testing, etc. ☆40 · Updated 7 years ago
- Multi-node Presto cluster on Docker containers ☆124 · Updated 2 years ago
- Sample processing code using Spark 2.1+ and Scala ☆52 · Updated 4 years ago
- hive_compared_bq compares/validates two (SQL-like) tables and graphically shows the rows/columns that are different. ☆28 · Updated 7 years ago
- Apache Spark ETL Utilities ☆40 · Updated 7 months ago
- Docker image to submit Spark applications ☆38 · Updated 7 years ago
- A bridge to Apache Atlas for provenance metadata created in the course of using Apache NiFi ☆15 · Updated 2 years ago
- Parcel for Apache Airflow ☆17 · Updated 5 years ago
- Filling in the Spark function gaps across APIs ☆50 · Updated 4 years ago
- Examples for High Performance Spark ☆15 · Updated 7 months ago
- Spark connector for SFTP ☆100 · Updated 2 years ago
- A library that brings useful functions from various modern database management systems to Apache Spark ☆59 · Updated last year
- Spark SQL magic command for Jupyter notebooks ☆36 · Updated 4 years ago
- How to manage Slowly Changing Dimensions with Apache Hive ☆55 · Updated 5 years ago
- Data validation library for PySpark 3.0.0 ☆33 · Updated 2 years ago
- Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark. ☆75 · Updated last year
- A Spark metrics sink that pushes to InfluxDB ☆51 · Updated 4 years ago
- A Spark-based data comparison tool at scale that helps software development engineers compare a plethora of pair combinations o… ☆51 · Updated last year
- Example for the article "Running Spark 3 with standalone Hive Metastore 3.0" ☆98 · Updated 2 years ago
- Chef cookbook for the Airflow workflow management platform. ☆71 · Updated 5 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆88 · Updated last year
- An Airflow Docker image preconfigured to work well with Spark and Hadoop/EMR ☆174 · Updated last week
- Spark metrics related custom classes and sinks (e.g. Prometheus) ☆183 · Updated 2 years ago