☆41Jan 24, 2023Updated 3 years ago
Alternatives and similar repositories for docker-spark-airflow
Users who are interested in docker-spark-airflow are comparing it to the libraries listed below.
- Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks☆24Apr 2, 2022Updated 3 years ago
- Docker with Airflow and Spark standalone cluster☆263Aug 5, 2023Updated 2 years ago
- Project repository of Apache Airflow, deployed on Docker in Amazon EC2 via GitLab.☆15Sep 3, 2021Updated 4 years ago
- This is a boilerplate which has dependencies for pyspark(3.3.0) mongo(>4.x) connectivity☆10May 3, 2024Updated last year
- The practical use-cases of how to make your Machine Learning Pipelines robust and reliable using Apache Airflow.☆51Dec 8, 2022Updated 3 years ago
- ☆19Feb 25, 2022Updated 4 years ago
- Typings for Confluent Kafka Python Client☆27Dec 2, 2025Updated 3 months ago
- Companion repository that goes along with Snowflake's "Advanced Data Engineering with Snowflake" course☆23Apr 23, 2025Updated 10 months ago
- Jupyter notebooks for the teaching of mechanics☆11Oct 8, 2024Updated last year
- A partially implemented ODBC driver for the Trino distributed SQL engine☆18Feb 2, 2026Updated last month
- Code for blog at https://www.startdataengineering.com/post/python-for-de/☆101Jun 7, 2024Updated last year
- Distributed data sync using trimerge☆11Mar 26, 2024Updated last year
- ☆12May 22, 2023Updated 2 years ago
- Python implementation of binary max-heaps.☆11Mar 22, 2020Updated 5 years ago
- A Python package extending pandas with helper functions for simpler exploratory data analysis and data wrangling.☆10Feb 6, 2025Updated last year
- Base Kafka Producer, consumer, flask api and PySpark Structured streaming Job☆11Oct 20, 2021Updated 4 years ago
- A Retail store management system - DBMS project (Sep 2015) written in Django (Python)☆11Aug 11, 2020Updated 5 years ago
- Demo of using Airflow☆11Jun 24, 2022Updated 3 years ago
- Roadmap for all those who want to get a kick start as Data Scientist.☆11Feb 2, 2022Updated 4 years ago
- Data Analysis and Image Processing Python Course☆12Nov 4, 2014Updated 11 years ago
- This is the HTML-CSS source code to build my personal website.☆10Nov 13, 2025Updated 3 months ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment.☆38Mar 29, 2021Updated 4 years ago
- "쌤(SAM)! 도와주세요!" ("Teacher (SAM)! Please help!"): repository used for the presentation and blog posts.☆10Oct 28, 2023Updated 2 years ago
- Python implementation of EIP 1577 content hash☆16Dec 24, 2023Updated 2 years ago
- ☆10Jan 24, 2023Updated 3 years ago
- A simple CLI command that initialises a Kedro project from an existing Python package☆11Aug 23, 2024Updated last year
- A portable Datamart and Business Intelligence suite built with Docker, Airflow, dbt, duckdb and Superset☆47Dec 13, 2025Updated 2 months ago
- This is a very basic PoC for a graphical no-code builder that generates solidity smart contract code from a given blockly block.☆10Oct 22, 2020Updated 5 years ago
- The "World Data Report" is a Power BI project that offers a detailed overview of global data, covering weather, geographical, demographic…☆15Nov 30, 2025Updated 3 months ago
- Data Guy Story commandline☆11Dec 2, 2022Updated 3 years ago
- ☆13Jun 21, 2021Updated 4 years ago
- ☆12Apr 21, 2024Updated last year
- automagically fixes simple flake8 lints☆15Jun 26, 2024Updated last year
- Automatic inline image toggling as the cursor enters and exits them☆17Apr 18, 2023Updated 2 years ago
- ☆12Feb 24, 2026Updated last week
- ☆11Mar 14, 2023Updated 2 years ago
- API for parse.com in node.js☆44Aug 16, 2016Updated 9 years ago
- AutoML 2024: HPOD: Hyperparameter Optimization for Unsupervised Outlier Detection☆12Jul 12, 2024Updated last year
- Website for the vetiver 🏺 framework☆13May 28, 2025Updated 9 months ago