Awesome list for data pipelines
☆35Feb 6, 2023Updated 3 years ago
Alternatives and similar repositories for awesome-data-pipeline
Users that are interested in awesome-data-pipeline are comparing it to the libraries listed below.
- NoSQL extract, transform, load (ETL) toolkit with Python☆15Updated this week
- P2P Distributed deep learning framework that runs on PyTorch.☆24Apr 26, 2023Updated 2 years ago
- Spark Structured Streaming data pipeline that processes movie ratings data in real-time.☆14Mar 1, 2026Updated last month
- 🌟 An end-to-end full-stack data science project, including modelling, MLOps, and data storytelling. ✨☆16Aug 30, 2025Updated 7 months ago
- Deep Learning framework for fast and clean research with Pytorch☆13Oct 9, 2020Updated 5 years ago
- End-to-end ELT data engineering project☆22Dec 24, 2022Updated 3 years ago
- A simple HTTP webserver written in C.☆14Mar 28, 2023Updated 3 years ago
- Codebase, data and models for the Headline Grouping paper at NAACL2021☆12Oct 2, 2022Updated 3 years ago
- Reading comprehension based question-answering model for news articles.☆11Jun 22, 2022Updated 3 years ago
- Skooldio: Data Pipelines with Airflow☆23May 24, 2025Updated 10 months ago
- A Data Engineering Project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, Dbt, Polars, Doc…☆23Nov 19, 2024Updated last year
- end-to-end information extraction pipeline built by LayoutLMV2, pretrained model from HuggingFace☆11Aug 15, 2023Updated 2 years ago
- Open source RAG with Llama Index for Japanese LLM in low resource setting☆10May 12, 2025Updated 10 months ago
- ☆24Dec 21, 2020Updated 5 years ago
- automated insights for tabular data☆10Feb 10, 2025Updated last year
- Quora Paraphrasing Dataset Bahasa Indonesia Version☆11Apr 18, 2021Updated 4 years ago
- Fine grain access in Amazon Managed Workflows For Apache Airflow☆11Jul 30, 2021Updated 4 years ago
- A demo and tutorial for Council that implements a financial analyst agent.☆11Jun 21, 2024Updated last year
- Newspaper Segmentation into images and text☆12Jan 11, 2019Updated 7 years ago
- A self-contained, ready to run Airflow and Kafka project. Can be run locally or within codespaces.☆16Jul 15, 2023Updated 2 years ago
- An offline CPU-first low-resource chat application to perform RAG on your corpus of data. Powered by OpenChat and CTranslate2.☆14May 14, 2025Updated 10 months ago
- DataTalks.Club's Data Engineering Zoomcamp Project☆24Jul 14, 2022Updated 3 years ago
- Terraform module which creates Redis ElastiCache resources on AWS.☆12Dec 9, 2022Updated 3 years ago
- 3D Mesh Generation from 2D Images in Python☆13Feb 12, 2024Updated 2 years ago
- Triton backend for https://github.com/OpenNMT/CTranslate2☆11Aug 20, 2024Updated last year
- A collection of example database schemas meant to illustrate common patterns in database design☆21Mar 25, 2020Updated 6 years ago
- Integrate Claude Code and Gemini CLI into your Obsidian workflow☆24Aug 21, 2025Updated 7 months ago
- repo of files pertaining to realtime, offline translations using whisper realtime and argos translate. This repo is marked Creative Commo…☆19May 20, 2025Updated 10 months ago
- ☆26Dec 18, 2020Updated 5 years ago
- secure your api endpoint by limiting access over period of time.☆10Oct 18, 2019Updated 6 years ago
- This AI tool leverages different LLM services to generate product information from a given image. Simply upload an image of a product and…☆16Jun 25, 2024Updated last year
- Ncurses tool to view the internals of a PostgreSQL database☆16Jan 9, 2015Updated 11 years ago
- The code repo for Youtube tutorial series about using Python asyncio with OpenCV to grab frames from video cameras concurrently☆16Oct 3, 2021Updated 4 years ago
- Summary and archive of Vatican .va (Holy See) ccTLD zone data for researchers.☆13Apr 26, 2023Updated 2 years ago
- A simple package of face detection☆14Nov 27, 2020Updated 5 years ago
- Detecting emotions by analysing the sound of the person using Python☆11Oct 7, 2019Updated 6 years ago
- Open episode of the data engineering practice course☆32Jul 2, 2024Updated last year
- Simple ETL pipeline using Python☆29May 22, 2023Updated 2 years ago
- Targeted Aspect-based Sentiment Analysis on SentiHood Dataset (PyTorch)☆11Aug 4, 2020Updated 5 years ago