RWaltersMA / mongo-spark-jupyter
Docker environment that spins up a MongoDB replica set, Spark, and Jupyter Lab. Example code uses PySpark and the MongoDB Spark Connector.
☆40Updated 2 years ago
Alternatives and similar repositories for mongo-spark-jupyter
Users that are interested in mongo-spark-jupyter are comparing it to the libraries listed below
- Repo that relates to the Medium blog 'Keeping your ML model in shape with Kafka, Airflow and MLFlow'☆119Updated 2 years ago
- Amazon Redshift Cookbook, Published by Packt☆15Updated 2 years ago
- Create streaming data, transfer it to Kafka, modify it with PySpark, and send it to ElasticSearch and MinIO☆60Updated last year
- A Series of Notebooks on how to start with Kafka and Python☆154Updated 3 months ago
- Docker Airflow - Contains a docker compose file for Airflow 2.0☆67Updated 2 years ago
- Airflow helm chart for AWS EKS☆18Updated 4 years ago
- Data lake, data warehouse on GCP☆56Updated 3 years ago
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,…☆90Updated 3 years ago
- 🚨 Simple, self-contained fraud detection system built with Apache Kafka and Python☆87Updated 6 years ago
- Materials for the next course☆24Updated 2 years ago
- ☆40Updated 11 months ago
- ☆37Updated 5 years ago
- Materials for the course The Complete Hands-On Introduction to Apache Airflow☆31Updated 2 years ago
- Apache Spark Structured Streaming with Kafka using Python (PySpark)☆40Updated 6 years ago
- Jupyter notebooks for pyspark tutorials given at University☆107Updated 5 months ago
- (project & tutorial) dag pipeline tests + ci/cd setup☆88Updated 4 years ago
- This is my Apache Airflow Local development setup on Windows 10 WSL2/Mac using docker-compose. It will also include some sample DAGs and …☆31Updated last year
- Simple alert system implemented in Kafka and Python☆95Updated 6 years ago
- AWS Glue tutorial for data developers.☆23Updated 5 years ago
- A real-time event pipeline around Kafka Ecosystem for Chicago Transit Authority.☆31Updated last year
- Resources for video demonstrations and blog posts related to DataOps on AWS☆176Updated 3 years ago
- Public source code for the Udemy online course Apache Airflow: Complete Hands-On Beginner to Advanced Class.☆63Updated 4 years ago
- Learn Apache Airflow the easy way☆31Updated 3 years ago
- Public source code for the Batch Processing with Apache Beam (Python) online course☆18Updated 4 years ago
- Dockerizing an Apache Spark Standalone Cluster☆43Updated 2 years ago
- Mastering Big Data Analytics with PySpark, Published by Packt☆159Updated 9 months ago
- Code examples on Apache Spark using python☆107Updated 2 years ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment.☆38Updated 4 years ago
- Code Repository for AWS Certified Big Data Specialty 2019 - In Depth and Hands On!, published by Packt☆40Updated last year
- ☆180Updated 2 years ago