velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docker Compose.
☆21Aug 12, 2025Updated 10 months ago
Alternatives and similar repositories for end-to-end-etl-pipeline-jcdecaux-API
Users that are interested in end-to-end-etl-pipeline-jcdecaux-API are comparing it to the libraries listed below.
- Data Pipeline from the Global Historical Climatology Network DataSet☆27Dec 20, 2022Updated 3 years ago
- Public data and analytics for our open course☆34Mar 22, 2024Updated 2 years ago
- A Data Engineering Project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, Dbt, Polars, Doc…☆24Nov 19, 2024Updated last year
- NoSQL extract, transform, load (ETL) toolkit with Python☆16Jun 21, 2026Updated last week
- Sample project to demonstrate data engineering best practices☆220Feb 24, 2024Updated 2 years ago
- an end-to-end data pipeline extracting music listening habits and producing an insightful dashboard☆18Mar 31, 2024Updated 2 years ago
- Spark Structured Streaming data pipeline that processes movie ratings data in real-time.☆14Apr 15, 2026Updated 2 months ago
- 🌟 An end-to-end full-stack data science project, including modelling, MLOps, and data storytelling. ✨☆16Aug 30, 2025Updated 9 months ago
- A real-time reddit data streaming pipeline for sentiment analysis of various subreddits☆149Aug 23, 2023Updated 2 years ago
- ☆30Feb 11, 2024Updated 2 years ago
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data☆51Dec 2, 2023Updated 2 years ago
- Creating a REST API with Python on Synapse Serverless pools using external tables☆12Dec 27, 2021Updated 4 years ago
- StarCraft 2 Data Pipeline with Airflow, DuckDB and Streamlit☆16Mar 14, 2024Updated 2 years ago
- Data pipeline that scrapes Rust cheater Steam profiles☆54Feb 13, 2022Updated 4 years ago
- Data Engineering Project to Extract and Process Solana Reddit Data☆40Feb 3, 2024Updated 2 years ago
- Ecosystem website for Apache Flink☆12Jan 22, 2024Updated 2 years ago
- End to end data engineering project☆59Oct 27, 2022Updated 3 years ago
- Microsoft 365 Defender Hunting via PowerShell.☆14Feb 8, 2022Updated 4 years ago
- Generate synthetic Spotify music stream dataset to create dashboards. Spotify API generates fake event data emitted to Kafka. Spark consu…☆72Dec 17, 2023Updated 2 years ago
- ☆10Aug 20, 2024Updated last year
- This project shows how to capture changes from postgres database and stream them into kafka☆42May 17, 2024Updated 2 years ago
- Function to rotate storage account keys stored in key vault as secret☆13Nov 15, 2023Updated 2 years ago
- A Python Snowpark CLI for loading the TPC-DI dataset into Snowflake. Additional dbt models for building the data warehouse.☆11Sep 4, 2025Updated 9 months ago
- ☆16Jan 19, 2022Updated 4 years ago
- An Azure Monitor workbook for LogicApps☆18Apr 10, 2020Updated 6 years ago
- This repository contains notebooks with different probability density function estimators.☆13Jun 4, 2020Updated 6 years ago
- Skooldio: Data Pipelines with Airflow☆23May 24, 2025Updated last year
- ☆11Dec 28, 2020Updated 5 years ago
- ☆15May 1, 2024Updated 2 years ago
- A fully serverless, event-driven data pipeline that ingests, enriches, validates, and visualizes real-time news data using AWS services. …☆25Aug 10, 2025Updated 10 months ago
- This repository contains a Docker Compose configuration for running ScyllaDB, a highly scalable NoSQL database for learning and testing.☆14Sep 19, 2024Updated last year
- Welcome to my data engineering projects repository! Here you will find a collection of data engineering projects that I have worked on.☆25Apr 27, 2023Updated 3 years ago
- Leveraging Hortonworks' HDP 3.1.0 and HDF 3.4.0 components, this tutorial guides the user through steps to stream data from a REST API in…☆19Aug 16, 2019Updated 6 years ago
- This project focuses on building a robust data pipeline using Apache Airflow to automate the ingestion of weather data from the OpenWeath…☆22Feb 3, 2026Updated 4 months ago
- Code for data quality with greatexpectations blog☆13Jul 30, 2024Updated last year
- ☆14Updated this week
- rust-for-data☆53Jul 12, 2023Updated 2 years ago
- Pulsar Presto (outdated), go to https://github.com/apache/pulsar-sql instead☆18Oct 18, 2024Updated last year
- ☆13Feb 27, 2024Updated 2 years ago