Uses Airflow, Postgres, Kafka, Spark, Cassandra, and GitHub Actions to establish an end-to-end data pipeline
☆32Oct 25, 2023Updated 2 years ago
Alternatives and similar repositories for Data-Streaming-Project
Users who are interested in Data-Streaming-Project are comparing it to the libraries listed below.
- End-to-end data pipeline that ingests, processes, and stores data. It uses Apache Airflow to schedule scripts that fetch data from an API…☆21Jul 26, 2024Updated last year
- The unique data management platform for Julia☆16Apr 25, 2022Updated 3 years ago
- Code for the Data Engineering Zoomcamp☆20Dec 12, 2022Updated 3 years ago
- Scan and monitor your network effortlessly! Nmap Prometheus Exporter provides insights into network health and security with Prometheus-c…☆15Oct 2, 2023Updated 2 years ago
- Deploy a complete data stack in just a couple of minutes.☆15Mar 6, 2024Updated 2 years ago
- ☆46Jul 6, 2024Updated last year
- 🚂 Fine-tune OpenAI models for text classification, question answering, and more☆17May 1, 2023Updated 2 years ago
- Explore tips and tricks to deploy machine learning models with Docker.☆13Jul 6, 2023Updated 2 years ago
- Data Engineer Project: An end-to-end Airflow data pipeline with BigQuery, dbt, Soda, and more!☆12Dec 14, 2023Updated 2 years ago
- ☆19Jun 22, 2022Updated 3 years ago
- A well-documented explanation of data structure types including Linked List, Hash table, Binary Tree, Queues, Stack☆13Jul 30, 2022Updated 3 years ago
- Demonstration of LLM integration into a lex bot using Lambda codehooks and a Sagemaker endpoint.☆14Dec 20, 2023Updated 2 years ago
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data☆52Dec 2, 2023Updated 2 years ago
- Single-click deployment, serverless data pipeline that moves Google Analytics raw data to S3 and ETL's it into BigQuery schema☆20Jun 2, 2021Updated 4 years ago
- Docker Apache Airflow☆13Mar 1, 2023Updated 3 years ago
- A Docker Compose template that builds an interactive development environment for PySpark with Jupyter Lab, MinIO as object storage, Hive M…☆47Dec 19, 2024Updated last year
- Scripts and tooling to migrate DW and Spark workloads to Fabric.☆27Apr 9, 2024Updated 2 years ago
- ☆15Mar 14, 2024Updated 2 years ago
- ☆24Dec 4, 2023Updated 2 years ago
- A backtest a day keeps the losses away!☆15Sep 11, 2023Updated 2 years ago
- ☆24Jul 21, 2022Updated 3 years ago
- A simple, customizable, and modern library for displaying alert banners in your Jetpack Compose and Compose Multiplatform applications.☆43Aug 17, 2025Updated 7 months ago
- A data pipeline moving data from a Relational database system (RDBMS) to a Hadoop file system (HDFS).☆15Jun 3, 2021Updated 4 years ago
- Building a data warehouse with SQL Server, including ETL processes, data modeling & analysis.☆25Apr 15, 2025Updated 11 months ago
- Fully dockerized Data Warehouse (DWH) using Airflow, dbt, PostgreSQL and dashboard using redash☆25Nov 12, 2022Updated 3 years ago
- Glue ETL job or EMR Spark that gets from data catalog, modifies and uploads to S3 and Data Catalog☆13Aug 26, 2023Updated 2 years ago
- A collection of practice projects on Big Data topics, including Hadoop, Spark, Spark Streaming, and Kafka☆11Mar 7, 2019Updated 7 years ago
- This repo is for the Linkedin Learning course: End-to-End Data Engineering Project☆31Nov 9, 2023Updated 2 years ago
- Sample code to collect Apache Iceberg metrics for table monitoring☆29Aug 18, 2024Updated last year
- A platform that helps developers to better understand CSS through declaration interpretation and may even improve them through suggestion…☆14Jul 3, 2021Updated 4 years ago
- Spark Notebook docker image☆10Dec 29, 2017Updated 8 years ago
- velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke…☆20Aug 12, 2025Updated 7 months ago
- This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project de…☆11Nov 18, 2023Updated 2 years ago
- Ecommerce Realtime Data Pipeline (Data Modeling, Workflow Orchestration, Change Data Capture, Analytical Database and Dashboarding)☆65Mar 9, 2024Updated 2 years ago
- 🚀 A simple javascript template for rapid development of GitHub actions.☆17Feb 24, 2023Updated 3 years ago
- ☆26Aug 28, 2023Updated 2 years ago
- Uses Flask and Tesseract for basic OCR; also requires opencv2, which the code uses for basic image processing☆26May 3, 2017Updated 8 years ago
- I will share DSA notes and code here☆19Mar 24, 2023Updated 3 years ago
- A list of professional data projects to enrich your portfolio☆52Feb 27, 2026Updated last month