☆41Jan 24, 2023Updated 3 years ago
Alternatives and similar repositories for docker-spark-airflow
Users who are interested in docker-spark-airflow are comparing it to the libraries listed below.
- Doing sql in notebooks.☆15Aug 14, 2023Updated 2 years ago
- Project with Airflow + Spark + MinIO + Postgres + Python3.8☆28Sep 9, 2022Updated 3 years ago
- Building a Modern Data Lake with Minio, Spark, Airflow via Docker.☆23May 11, 2024Updated last year
- Project repository of Apache Airflow, deployed on Docker in Amazon EC2 via GitLab.☆15Sep 3, 2021Updated 4 years ago
- ☆15Aug 28, 2025Updated 7 months ago
- Geospatial Next.js app with DuckDB-Wasm☆15May 24, 2023Updated 2 years ago
- The practical use-cases of how to make your Machine Learning Pipelines robust and reliable using Apache Airflow.☆52Dec 8, 2022Updated 3 years ago
- ☆22Mar 21, 2023Updated 3 years ago
- This is a boilerplate which has dependencies for pyspark(3.3.0) mongo(>4.x) connectivity☆10May 3, 2024Updated last year
- Code for blog at https://www.startdataengineering.com/post/python-for-de/☆103Jun 7, 2024Updated last year
- Schedules a bot to send a message everyday☆15May 22, 2023Updated 2 years ago
- Singapore Condo Rental Prices - From Data Acquisition to Prediction☆14Feb 13, 2021Updated 5 years ago
- A Python package to help Databricks Unity Catalog users to read and query Delta Lake tables with Polars, DuckDb, or PyArrow.☆27Mar 25, 2024Updated 2 years ago
- Typings for Confluent Kafka Python Client☆27Updated this week
- ☆19Feb 25, 2022Updated 4 years ago
- An example of running Testcontainer tests in CI pipelines.☆18Apr 13, 2025Updated 11 months ago
- Cast Spotify to your Raspberry Pi via the browser!☆17Oct 19, 2014Updated 11 years ago
- NSDictionary, NSArray, and NSSet categories that offer a method to recursively merge two objects together such that entries which exist f…☆14Feb 1, 2014Updated 12 years ago
- ☆16Apr 1, 2025Updated 11 months ago
- Mongo Aggregation Builder☆43Oct 1, 2014Updated 11 years ago
- ☆25Mar 15, 2024Updated 2 years ago
- A Python package extending pandas with helper functions for simpler exploratory data analysis and data wrangling.☆10Feb 6, 2025Updated last year
- Integrating Apache Airflow, dbt, Great Expectations and Apache Superset to develop a modern open source data stack.☆16Jun 19, 2022Updated 3 years ago
- Data Analysis and Image Processing Python Course☆12Nov 4, 2014Updated 11 years ago
- Wrapper on top of pino which provides integration with cls-hooked for better context in log messages☆12Feb 11, 2022Updated 4 years ago
- A portable Datamart and Business Intelligence suite built with Docker, Airflow, dbt, duckdb and Superset☆48Mar 9, 2026Updated 2 weeks ago
- Generate and Compare Debezium CDC (Change Data Capture) Avro Schema, directly from your Database.☆24Updated this week
- Topic Mine leverages 1st and/or 2nd party data to identify trending topics and uses our GEMINI to create relevant ads' texts. It generate…☆27May 16, 2025Updated 10 months ago
- A partially implemented ODBC driver for the Trino distributed SQL engine☆18Feb 2, 2026Updated last month
- small configuration for the home server.☆24Dec 27, 2022Updated 3 years ago
- Compare DuckDB, Polars and Pandas for generating an artificial dataset of persons and companies☆35Aug 31, 2023Updated 2 years ago
- Kafka in a Container☆14Dec 30, 2021Updated 4 years ago
- This project demonstrates how to use Apache Airflow to submit jobs to an Apache Spark cluster in different programming languages using Python…☆48Mar 14, 2024Updated 2 years ago
- A simple CLI command that initialises a Kedro project from an existing Python package☆11Aug 23, 2024Updated last year
- ☆10Jan 24, 2023Updated 3 years ago
- The "World Data Report" is a Power BI project that offers a detailed overview of global data, covering weather, geographical, demographic…☆15Nov 30, 2025Updated 3 months ago
- Source Code for the video series on developing a pushups logger web application with CRUD and user authentication features using Flask.☆29Sep 23, 2022Updated 3 years ago
- ☆11Apr 9, 2017Updated 8 years ago
- ☆12May 22, 2023Updated 2 years ago