This project shows how to capture changes from a Postgres database and stream them into Kafka
☆42May 17, 2024Updated last year
Alternatives and similar repositories for changecapture-e2e
Users that are interested in changecapture-e2e are comparing it to the libraries listed below.
- An end-to-end data engineering pipeline that fetches real-time YouTube analytics and streams them through Kafka for processing with ksqlD…☆16Sep 19, 2023Updated 2 years ago
- This project provides an end-to-end data processing and visualization of visa numbers in Japan using PySpark and Plotly. The spark cluste…☆12Oct 11, 2023Updated 2 years ago
- This project showcases how to integrate the world of DevOps, focusing on Continuous Integration (CI) and Continuous Deployment (CD) with …☆14Dec 27, 2023Updated 2 years ago
- This repository contains the necessary configuration files and DAGs (Directed Acyclic Graphs) for setting up a robust data engineering en…☆25Jan 26, 2024Updated 2 years ago
- This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessar…☆50Dec 4, 2023Updated 2 years ago
- In this project, we set up end-to-end data engineering using Apache Spark, Azure Databricks, Data Build Tool (DBT) using Azure as our …☆39Dec 18, 2023Updated 2 years ago
- A Python package that creates fine-grained dbt tasks on Apache Airflow☆20Apr 25, 2024Updated 2 years ago
- This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data wareh…☆214Oct 23, 2023Updated 2 years ago
- An end-to-end data engineering pipeline that fetches data from Wikipedia, cleans and transforms it with Apache Airflow and saves it on Az…☆32Oct 2, 2023Updated 2 years ago
- Data Engineering Bootcamp☆31Aug 5, 2025Updated 8 months ago
- This construct builds some elements for you to quickly launch an EMR Serverless application. After submitting the Emr Serverless job, you…☆11Nov 18, 2025Updated 5 months ago
- An end-to-end, containerized data pipeline for near-real-time user event analytics using Kafka, ClickHouse, Airflow, and PySpark. Made to…☆78Sep 12, 2025Updated 7 months ago
- Apache Airflow advanced functionalities examples☆21Mar 22, 2024Updated 2 years ago
- Dbt package for Apache Airflow inspired macros☆17Dec 21, 2025Updated 4 months ago
- velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke…☆20Aug 12, 2025Updated 8 months ago
- Building a Data Pipeline with an Open Source Stack☆59Jun 27, 2025Updated 10 months ago
- This repository showcases a collection of machine learning projects in various domains, demonstrating my skills and expertise as a data s…☆11Nov 20, 2023Updated 2 years ago
- This is an end-to-end MLOps system☆34Nov 27, 2025Updated 5 months ago
- Glue ETL job or EMR Spark job that reads from the Data Catalog, modifies the data, and uploads it to S3 and the Data Catalog☆13Aug 26, 2023Updated 2 years ago
- GPT-4o Powered Calorie Detector☆18May 29, 2024Updated last year
- This solution helps you deploy ETL processes and data storage resources to create an Insurance Lake using Amazon S3 buckets for storage, …☆35Mar 12, 2026Updated last month
- ☆10Jan 8, 2024Updated 2 years ago
- Automatically backing up your Postgres database using NodeJS☆13Nov 14, 2020Updated 5 years ago
- Produce Kafka messages, consume them and upload into Cassandra, MongoDB.☆43Sep 26, 2023Updated 2 years ago
- ☆10Jan 18, 2024Updated 2 years ago
- A demonstration of an ELT (Extract, Load, Transform) pipeline☆31Feb 19, 2024Updated 2 years ago
- A collection of practice projects on big data topics, including Hadoop, Spark, Spark Streaming, and Kafka☆11Mar 7, 2019Updated 7 years ago
- A platform that helps developers to better understand CSS through declaration interpretation and may even improve them through suggestion…☆14Jul 3, 2021Updated 4 years ago
- Project for "Data pipeline design patterns" blog.☆51Aug 6, 2024Updated last year
- KazeWP is a simple and flexible tool for managing multiple WordPress sites behind a Caddy reverse proxy server. Built with Docker and Bas…☆17Apr 28, 2025Updated last year
- TTS utility☆12Aug 2, 2020Updated 5 years ago
- ☆14Mar 11, 2023Updated 3 years ago
- Spark Notebook docker image☆10Dec 29, 2017Updated 8 years ago
- This is an example of using MongoDB as both a source and sink.☆10May 21, 2020Updated 5 years ago
- Design pattern for orchestrating an incremental data ingestion pipeline using AWS Step Functions from an on premise location into an Amaz…☆29Jul 24, 2019Updated 6 years ago
- ☆15Nov 16, 2024Updated last year
- ☆13Jan 6, 2022Updated 4 years ago
- Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi☆120Dec 15, 2023Updated 2 years ago
- My talk at EuroPython 2016☆12Sep 8, 2021Updated 4 years ago