Awesome list for data pipelines
☆37Feb 6, 2023Updated 3 years ago
Alternatives and similar repositories for awesome-data-pipeline
Users that are interested in awesome-data-pipeline are comparing it to the libraries listed below.
- NoSQL extract, transform, load (ETL) toolkit with Python☆16Jun 21, 2026Updated last week
- an end-to-end data pipeline extracting music listening habits and producing an insightful dashboard☆18Mar 31, 2024Updated 2 years ago
- Spark Structured Streaming data pipeline that processes movie ratings data in real-time.☆14Apr 15, 2026Updated 2 months ago
- 🌟 An end-to-end full-stack data science project, including modelling, MLOps, and data storytelling. ✨☆16Aug 30, 2025Updated 9 months ago
- In this project I have built an ETL pipeline which scrapes the trending repositories based on month, week and day LIVE extract other related info…☆12Sep 9, 2023Updated 2 years ago
- A data engineering project with Airflow, dbt, Terraform, GCP and much more!☆27Nov 8, 2022Updated 3 years ago
- ☆32Feb 2, 2026Updated 4 months ago
- Scrape South African news☆13May 22, 2023Updated 3 years ago
- End-to-end ELT data engineering project☆23Dec 24, 2022Updated 3 years ago
- Keyword extraction using Scake, KeyBERT, Fine-tuning Transformer BERT-like models and ChatGPT.☆12May 22, 2023Updated 3 years ago
- velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke…☆21Aug 12, 2025Updated 10 months ago
- A Rust crate offering similar functionality to the Python transformers package using Candle.☆15Nov 19, 2024Updated last year
- Reading comprehension based question-answering model for news articles.☆11Jun 22, 2022Updated 4 years ago
- Skooldio: Data Pipelines with Airflow☆23May 24, 2025Updated last year
- A Data Engineering Project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, Dbt, Polars, Doc…☆24Nov 19, 2024Updated last year
- End-to-end information extraction pipeline built with LayoutLMv2, a pretrained model from HuggingFace☆11Aug 15, 2023Updated 2 years ago
- Open source RAG with Llama Index for Japanese LLM in low resource setting☆10May 12, 2025Updated last year
- automated insights for tabular data☆10Feb 10, 2025Updated last year
- Welcome to my data engineering projects repository! Here you will find a collection of data engineering projects that I have worked on.☆25Apr 27, 2023Updated 3 years ago
- Quora Paraphrasing Dataset Bahasa Indonesia Version☆11Apr 18, 2021Updated 5 years ago
- Newspaper Segmentation into images and text☆12Jan 11, 2019Updated 7 years ago
- A self-contained, ready to run Airflow and Kafka project. Can be run locally or within codespaces.☆16Jul 15, 2023Updated 2 years ago
- An offline CPU-first low-resource chat application to perform RAG on your corpus of data. Powered by OpenChat and CTranslate2.☆15May 14, 2025Updated last year
- 3D Mesh Generation from 2D Images in Python☆13Feb 12, 2024Updated 2 years ago
- Integrate Claude Code and Gemini CLI into your Obsidian workflow☆27May 28, 2026Updated last month
- ☆25Dec 18, 2020Updated 5 years ago
- Secure your API endpoint by limiting access over a period of time.☆10Oct 18, 2019Updated 6 years ago
- This AI tool leverages different LLM services to generate product information from a given image. Simply upload an image of a product and…☆16Jun 25, 2024Updated 2 years ago
- A simple package of face detection☆14Nov 27, 2020Updated 5 years ago
- Summary and archive of Vatican .va (Holy See) ccTLD zone data for researchers.☆13Apr 26, 2023Updated 3 years ago
- Detecting emotions by analysing the sound of a person's voice using Python☆11Oct 7, 2019Updated 6 years ago
- Open episode of the data engineering practice course☆32Jul 2, 2024Updated last year
- Highlights the current yank.☆12Jul 13, 2022Updated 3 years ago
- Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary.☆29Aug 14, 2023Updated 2 years ago
- Simple ETL pipeline using Python☆29May 22, 2023Updated 3 years ago
- Chrome extension: password generator from master key using PBKDF2 with SHA-256.☆19Sep 14, 2015Updated 10 years ago
- Targeted Aspect-based Sentiment Analysis on SentiHood Dataset (PyTorch)☆11Aug 4, 2020Updated 5 years ago
- ☆12Apr 9, 2021Updated 5 years ago
- A minimal, configurable and highly optimized markdown2html compiler, supports macros, watch mode, syntax highlighting, latex math and liv…☆14Aug 10, 2023Updated 2 years ago