Awesome list for data pipelines
☆37Feb 6, 2023Updated 3 years ago
Alternatives and similar repositories for awesome-data-pipeline
Users who are interested in awesome-data-pipeline are comparing it to the repositories listed below.
- P2P Distributed deep learning framework that runs on PyTorch.☆24Apr 26, 2023Updated 3 years ago
- Spark Structured Streaming data pipeline that processes movie ratings data in real-time.☆14Apr 15, 2026Updated last month
- 🌟 An end-to-end full-stack data science project, including modelling, MLOps, and data storytelling. ✨☆16Aug 30, 2025Updated 9 months ago
- Deep Learning framework for fast and clean research with PyTorch☆13Oct 9, 2020Updated 5 years ago
- In this project I have built an ETL pipeline which scrapes the trending repositories based on month, week and day LIVE extract other related info…☆12Sep 9, 2023Updated 2 years ago
- A data engineering project with Airflow, dbt, Terraform, GCP and much more!☆26Nov 8, 2022Updated 3 years ago
- End-to-end ELT data engineering project☆23Dec 24, 2022Updated 3 years ago
- ☆14Jan 21, 2023Updated 3 years ago
- Codebase, data and models for the Headline Grouping paper at NAACL2021☆12Oct 2, 2022Updated 3 years ago
- Reading comprehension based question-answering model for news articles.☆11Jun 22, 2022Updated 3 years ago
- Skooldio: Data Pipelines with Airflow☆23May 24, 2025Updated last year
- ☆24Dec 21, 2020Updated 5 years ago
- Welcome to my data engineering projects repository! Here you will find a collection of data engineering projects that I have worked on.☆25Apr 27, 2023Updated 3 years ago
- A demo and tutorial for Council that implements a financial analyst agent.☆11Jun 21, 2024Updated last year
- An offline CPU-first low-resource chat application to perform RAG on your corpus of data. Powered by OpenChat and CTranslate2.☆15May 14, 2025Updated last year
- repo of files pertaining to realtime, offline translations using whisper realtime and argos translate. This repo is marked Creative Commo…☆19May 20, 2025Updated last year
- Integrate Claude Code and Gemini CLI into your Obsidian workflow☆26May 28, 2026Updated last week
- Secure your API endpoint by limiting access over a period of time.☆10Oct 18, 2019Updated 6 years ago
- This AI tool leverages different LLM services to generate product information from a given image. Simply upload an image of a product and…☆15Jun 25, 2024Updated last year
- PrintCSS examples created over time. Mostly HTML, some Markdown.☆14Apr 15, 2022Updated 4 years ago
- How Media Cloud approaches extracting metadata from online news stories☆17Apr 15, 2026Updated last month
- The code repo for Youtube tutorial series about using Python asyncio with OpenCV to grab frames from video cameras concurrently☆16Oct 3, 2021Updated 4 years ago
- Summary and archive of Vatican .va (Holy See) ccTLD zone data for researchers.☆13Apr 26, 2023Updated 3 years ago
- A Rust crate for easily implementing faster-whisper STT in your Rust programs.☆24Oct 20, 2025Updated 7 months ago
- Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary.☆29Aug 14, 2023Updated 2 years ago
- Simple ETL pipeline using Python☆29May 22, 2023Updated 3 years ago
- Bitcoin Hourly OHLCV with 70+ Technical Indicators | Daily Updated Dataset for ML & Trading Analysis☆26Updated this week
- Various stuff and tweaks I have around Obsidian☆13Jun 20, 2025Updated 11 months ago
- ☆14Jun 26, 2025Updated 11 months ago
- An Obsidian plugin to create meeting notes from Microsoft Outlook .msg files☆15Apr 2, 2025Updated last year
- ☆13Aug 20, 2021Updated 4 years ago
- The goal of this project is to illustrate Extract Transform Load (ETL) using Python and SQL. ETL is a process commonly done in computing,…☆33Sep 7, 2021Updated 4 years ago
- Safitty is a wrapper on JSON/YAML configs for Python☆30Mar 19, 2020Updated 6 years ago
- Data Engineering Project: Extracting music video metrics of Twice using YouTube API, AWS, and Tableau☆32Nov 21, 2023Updated 2 years ago
- Yonsei Natural Language Understanding tool☆12Dec 7, 2022Updated 3 years ago
- Cellular Automata - Pokemon Type Battle Simulation☆11Oct 26, 2024Updated last year
- Spark data pipeline that processes movie ratings data.☆31May 1, 2026Updated last month
- ☆13Dec 24, 2023Updated 2 years ago
- Translate MobileNet v2 to Movidius stick.☆11Aug 14, 2018Updated 7 years ago