brunocfnba / docker-spark-cluster
Set up a 3-node Spark cluster using Docker containers
☆33 Updated 6 years ago
Related projects:
- Data validation library for PySpark 3.0.0 ☆34 Updated last year
- spark on kubernetes ☆105 Updated last year
- Repo for all my code on the articles I post on medium ☆105 Updated last year
- An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR ☆171 Updated 10 months ago
- ☆71 Updated 3 years ago
- Docker container for Kafka - Spark Streaming - Cassandra ☆98 Updated 5 years ago
- Asynchronous actions for PySpark ☆44 Updated 2 years ago
- Code Repository for the EVO-ODAS ☆31 Updated 6 years ago
- Deploy your Spark Production Cluster on Kubernetes ☆47 Updated 4 years ago
- Apache Spark docker container image (Standalone mode) ☆36 Updated 3 years ago
- Use Airflow to move data from multiple MySQL databases to BigQuery ☆99 Updated 4 years ago
- Spark package for checking data quality ☆25 Updated last year
- Real-world Spark pipelines examples ☆83 Updated 6 years ago
- ☆37 Updated 5 years ago
- Just a boilerplate for PySpark and Flask ☆35 Updated 6 years ago
- How to manage Slowly Changing Dimensions with Apache Hive ☆55 Updated 5 years ago
- pyspark-cassandra is a Python port of the awesome @datastax Spark Cassandra connector. Compatible w/ Spark 2.0, 2.1, 2.2, 2.3 and 2.4 ☆69 Updated last year
- A tutorial on Apache Spark Unit Testing ☆37 Updated 8 years ago
- A Spark cluster setup running on Docker containers ☆60 Updated 4 years ago
- ☆33 Updated 4 years ago
- A curated list of all the awesome examples, articles, tutorials and videos for Apache Airflow. ☆96 Updated 3 years ago
- Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark. ☆75 Updated 4 months ago
- Repository used for Spark Trainings ☆53 Updated last year
- These are some code examples ☆55 Updated 4 years ago
- Example unit tests for Apache Spark Python scripts using the py.test framework ☆85 Updated 8 years ago
- Machine Learning Stack for Big Data, Big Cluster and Big Challenges ☆22 Updated 6 years ago
- ☆63 Updated 4 years ago
- Various Demos mostly based on docker environments ☆33 Updated last year
- Python API for Deequ ☆41 Updated 3 years ago
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ☆96 Updated last year