Awesome list for data pipelines
☆36Feb 6, 2023Updated 3 years ago
Alternatives and similar repositories for awesome-data-pipeline
Users who are interested in awesome-data-pipeline are comparing it to the libraries listed below.
- NoSQL extract, transform, load (ETL) toolkit with Python☆16May 9, 2026Updated last week
- An end-to-end data pipeline extracting music listening habits and producing an insightful dashboard☆17Mar 31, 2024Updated 2 years ago
- Spark Structured Streaming data pipeline that processes movie ratings data in real-time.☆14Apr 15, 2026Updated last month
- StarCraft 2 Data Pipeline with Airflow, DuckDB and Streamlit☆16Mar 14, 2024Updated 2 years ago
- Visualizing the ups and downs of bitcoin from the beginning with ALL price data available which you can download☆17Apr 29, 2026Updated 3 weeks ago
- Federated Learning of Diffusion Models☆12Aug 30, 2023Updated 2 years ago
- An ETL pipeline that scrapes trending repositories by month, week, and day, live, and extracts other related info…☆12Sep 9, 2023Updated 2 years ago
- End-to-end ELT data engineering project☆23Dec 24, 2022Updated 3 years ago
- Solana Smart contracts for Solana Blockchain protocol☆13Nov 16, 2024Updated last year
- Reading comprehension based question-answering model for news articles.☆11Jun 22, 2022Updated 3 years ago
- Skooldio: Data Pipelines with Airflow☆23May 24, 2025Updated 11 months ago
- A Data Engineering Project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, Dbt, Polars, Doc…☆23Nov 19, 2024Updated last year
- ETL of newspaper article keywords using Apache Airflow, Newspaper3k, Quilt T4 and AWS S3☆16May 11, 2026Updated last week
- Welcome to my data engineering projects repository! Here you will find a collection of data engineering projects that I have worked on.☆24Apr 27, 2023Updated 3 years ago
- ☆10Dec 2, 2025Updated 5 months ago
- Fine grain access in Amazon Managed Workflows For Apache Airflow☆11Jul 30, 2021Updated 4 years ago
- DataTalks.Club's Data Engineering Zoomcamp Project☆24Jul 14, 2022Updated 3 years ago
- Terraform module which creates Redis ElastiCache resources on AWS.☆12Dec 9, 2022Updated 3 years ago
- Integrate Claude Code and Gemini CLI into your Obsidian workflow☆25Aug 21, 2025Updated 8 months ago
- ☆25Dec 18, 2020Updated 5 years ago
- Secure your API endpoint by limiting access over a period of time.☆10Oct 18, 2019Updated 6 years ago
- This AI tool leverages different LLM services to generate product information from a given image. Simply upload an image of a product and…☆15Jun 25, 2024Updated last year
- Ncurses tool to view the internals of a PostgreSQL database☆17Jan 9, 2015Updated 11 years ago
- Basic Spark examples.☆11Jan 12, 2021Updated 5 years ago
- How Media Cloud approaches extracting metadata from online news stories☆17Apr 15, 2026Updated last month
- Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌☆28May 15, 2020Updated 6 years ago
- Detecting emotions by analysing the sound of the person, using Python☆11Oct 7, 2019Updated 6 years ago
- Open episode of the data engineering practice course☆32Jul 2, 2024Updated last year
- Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary.☆29Aug 14, 2023Updated 2 years ago
- Simple ETL pipeline using Python☆29May 22, 2023Updated 2 years ago
- Targeted Aspect-based Sentiment Analysis on SentiHood Dataset (PyTorch)☆11Aug 4, 2020Updated 5 years ago
- A desktop application to download historical data of desired crypto assets by connecting several different crypto-exchanges' API☆16Aug 6, 2021Updated 4 years ago
- Various stuff and tweaks I have around Obsidian☆12Jun 20, 2025Updated 11 months ago
- Bagpipes spaCy is a collection of custom spaCy pipeline components designed to enhance text processing capabilities.☆21Aug 15, 2024Updated last year
- ☆15Dec 5, 2023Updated 2 years ago
- ☆15Nov 28, 2023Updated 2 years ago
- An end-to-end real-time stock market data pipeline with Python, AWS EC2, Apache Kafka, and Cassandra. Data is processed on AWS EC2 with Apa…☆29Jun 7, 2023Updated 2 years ago
- ☆14Jun 26, 2025Updated 10 months ago
- Synchronize properties from your Obsidian notes with a Markwhen timeline file.☆12Sep 20, 2025Updated 8 months ago