📦 Batch data pipeline with Airflow, DuckDB, Delta Lake, Trino, MinIO, and Metabase. Full observability and data quality.
☆88 · Nov 5, 2025 · Updated 6 months ago
Alternatives and similar repositories for batch-data-pipeline
Users that are interested in batch-data-pipeline are comparing it to the libraries listed below.
- Hexagonal (ports and adapters) architecture applied to Spark and Python data engineering project ☆33 · Jul 26, 2023 · Updated 2 years ago
- An open and introductory book for the Python API of Apache Spark (pyspark) ☆12 · Sep 19, 2025 · Updated 7 months ago
- ☆15 · Mar 29, 2024 · Updated 2 years ago
- Open-source study on the turnover and renewal of coaches in Brazilian football. ☆36 · Sep 6, 2025 · Updated 8 months ago
- Content related to Mastering PostgreSQL, along with videos. ☆20 · Aug 18, 2021 · Updated 4 years ago
- ☆13 · Feb 20, 2026 · Updated 2 months ago
- markup to create labs for courses from the Google Cloud training catalog. ☆49 · Sep 15, 2025 · Updated 7 months ago
- Data Engineering Projects using Mage.ai as orchestrator ☆19 · Jan 20, 2026 · Updated 3 months ago
- NSCollectionView sample for OS X 10.11 ElCapitan ☆12 · Nov 24, 2017 · Updated 8 years ago
- Use MobileNet SSD and OpenCV to detect and count cars on the road ☆11 · Jan 13, 2020 · Updated 6 years ago
- A Python script to convert a YouTube URL to an MP3 file and download it to the same directory as the .py file. ☆10 · May 20, 2025 · Updated 11 months ago
- Transcribe speech to text, then receive a virtual assistant response to what you say from OpenAI ☆16 · Sep 23, 2022 · Updated 3 years ago
- Deploy a complete data stack in just a couple of minutes. ☆15 · Mar 6, 2024 · Updated 2 years ago
- ☆13 · Apr 24, 2026 · Updated last week
- Integrating Apache Airflow, dbt, Great Expectations and Apache Superset to develop a modern open source data stack. ☆16 · Jun 19, 2022 · Updated 3 years ago
- A sophisticated exploration of dbt macro capabilities, pushing the boundaries of what's possible with dbt's macro system. ☆18 · Feb 4, 2026 · Updated 3 months ago
- ☆16 · Apr 18, 2025 · Updated last year
- ☆11 · Nov 21, 2023 · Updated 2 years ago
- ☆31 · Aug 21, 2025 · Updated 8 months ago
- um its my portfolio? ☆16 · Feb 10, 2026 · Updated 2 months ago
- ☆11 · Feb 24, 2022 · Updated 4 years ago
- ☆16 · Nov 27, 2025 · Updated 5 months ago
- This workshop will familiarize you with some of the key steps towards building an autonomous driving data lake and extracting images from… ☆10 · Jul 12, 2022 · Updated 3 years ago
- An example of a project generated with cookiecutter-uv ☆15 · Apr 10, 2026 · Updated 3 weeks ago
- http://archive.ics.uci.edu/ml/index.html ☆12 · Jan 25, 2020 · Updated 6 years ago
- Create an Anime database containing all the Anime currently available on the website, which includes: 'Anime Title', 'Description', 'C… ☆12 · Jun 10, 2020 · Updated 5 years ago
- Batch processing, orchestration using Apache Airflow and Google Workflows, Spark Structured Streaming, and a lot more ☆18 · Jun 21, 2022 · Updated 3 years ago
- ☆22 · Mar 15, 2011 · Updated 15 years ago
- Miscellaneous codes and writings for MLOps ☆15 · Apr 8, 2026 · Updated 3 weeks ago
- Interactive web-based dashboard to manage traffic flow using YOLOX, DeepSORT ☆12 · Jul 30, 2022 · Updated 3 years ago
- Create and Run Dotfiles projects for Windows 10/11 ☆23 · Jan 26, 2025 · Updated last year
- General Assembly's Data Science Immersive Capstone Project ☆13 · Sep 9, 2021 · Updated 4 years ago
- Machine Learning Model and Deployment for Classification of Mango Varieties ☆10 · Dec 22, 2022 · Updated 3 years ago
- A local-first, terminal-based password manager built for people who care about security, simplicity, and control ☆37 · Dec 31, 2025 · Updated 4 months ago
- ☆56 · Jul 15, 2013 · Updated 12 years ago
- ☆10 · Jan 27, 2025 · Updated last year
- Apache Airflow advanced functionalities examples ☆21 · Mar 22, 2024 · Updated 2 years ago
- Create a chatbot that provides responses in Vietnamese, focusing on the products offered by a flower shop ☆11 · Nov 14, 2024 · Updated last year
- This repository provides a set of pre-configured settings to help you quickly set up and start using Obsidian ☆17 · Jan 19, 2024 · Updated 2 years ago