KennethanCeyer/awesome-data-pipeline

Awesome list for data pipelines

☆37

Alternatives and similar repositories for awesome-data-pipeline

Users who are interested in awesome-data-pipeline are comparing it to the libraries listed below.

VicenteYago / steam-data-engineering
A data engineering project with Airflow, dbt, Terraform, GCP and much more!
☆27 · Nov 8, 2022 · Updated 3 years ago
Vinothsuku / insightsR
automated insights for tabular data
☆10 · Feb 10, 2025 · Updated last year
kaoutaar / end-to-end-etl-pipeline-jcdecaux-API
velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke…
☆21 · Aug 12, 2025 · Updated 11 months ago
DeepsMoseli / Siamese-LSTM-on-sentence-similarity
Using Siamese LSTM to classify repeated Quora questions. Attempted pretrained BERT embeddings, Word2Vec and training own embeddings toget…
☆10 · Aug 28, 2020 · Updated 5 years ago
smitkiri / news-qa
Reading comprehension based question-answering model for news articles.
☆11 · Jun 22, 2022 · Updated 4 years ago
AbdulRehman555 / 3D-Mesh-Generation
3D Mesh Generation from 2D Images in Python
☆13 · Feb 12, 2024 · Updated 2 years ago
zkan / data-pipelines-with-airflow
Skooldio: Data Pipelines with Airflow
☆23 · May 24, 2025 · Updated last year
yang0369 / Information_Extraction
End-to-end information extraction pipeline built with LayoutLMv2, a pretrained model from Hugging Face
☆11 · Aug 15, 2023 · Updated 2 years ago
chain-ml / council-financial-analyst-agent
A demo and tutorial for Council that implements a financial analyst agent.
☆11 · Jun 21, 2024 · Updated 2 years ago
hollaugo / crewai-sales-report-generator
This repo contains the code for the tutorial for using the CrewAI agent framework to generate Sales Reports based on Salesforce data
☆13 · Mar 16, 2024 · Updated 2 years ago
zkan / dtc-data-engineering-zoomcamp-project
DataTalks.Club's Data Engineering Zoomcamp Project
☆24 · Jul 14, 2022 · Updated 4 years ago
yussan / seal-middleware-npm
Secure your API endpoint by limiting access over a period of time.
☆10 · Oct 18, 2019 · Updated 6 years ago
mediacloud / metadata-lib
How Media Cloud approaches extracting metadata from online news stories
☆17 · Apr 15, 2026 · Updated 3 months ago
CodersCreative / faster-whisper-rs
A Rust crate for easily integrating faster-whisper STT into your Rust programs.
☆24 · Oct 20, 2025 · Updated 9 months ago
ShahriyarR / concurrent-camera-reader
The code repo for the YouTube tutorial series about using Python asyncio with OpenCV to grab frames from video cameras concurrently
☆16 · Oct 3, 2021 · Updated 4 years ago
aigalaxy / voice-emotion-recognition
Detecting emotions by analysing the sound of the person, using Python
☆11 · Oct 7, 2019 · Updated 6 years ago
imsanjoykb / ETL-Project
The goal of this project is to illustrate Extract Transform Load (ETL) using Python and SQL. ETL is a process commonly done in computing,…
☆33 · Sep 7, 2021 · Updated 4 years ago
Nix07 / Utilizing-BERT-for-Aspect-Based-Sentiment-Analysis
Targeted Aspect-based Sentiment Analysis on SentiHood Dataset (PyTorch)
☆11 · Aug 4, 2020 · Updated 5 years ago
kometenstaub / obsidian-vim-yank-highlight
Highlights the current yank.
☆12 · Jul 13, 2022 · Updated 4 years ago
damklis / etljob
Simple ETL pipeline using Python
☆29 · May 22, 2023 · Updated 3 years ago
MaxHalford / data-science-tutorials
☆15 · Nov 28, 2023 · Updated 2 years ago
larsrinn / papermill-lambda
☆12 · Oct 12, 2018 · Updated 7 years ago
ani8897 / Realtime-Action-Recognition
Contains code for C3D, LCN and TSM for action recognition models.
☆10 · May 31, 2020 · Updated 6 years ago
anantshri / Obsidian_stuff
Various stuff and tweaks I have around Obsidian
☆13 · Jun 20, 2025 · Updated last year
tuanai-vireox / gcp-professional-data-engineer
GCP Professional Data Engineer Certification - Learning
☆25 · Jan 1, 2025 · Updated last year
dachosen1 / Common-Voice
Audio Classification with machine learning
☆18 · Jun 8, 2026 · Updated last month
advanced-security / awesome-secret-scanning
A curated list of awesome GitHub Advanced Security secret scanning resources.
☆17 · Updated this week
davidingerslev / outlook-meeting-notes
An Obsidian plugin to create meeting notes from Microsoft Outlook .msg files
☆14 · Apr 2, 2025 · Updated last year
mrp-yt / termux_ssh
Short guide on how to connect to Termux SSH from anywhere while using Tailscale as the connection link.
☆15 · Aug 30, 2021 · Updated 4 years ago
PacktPublishing / Python-for-Beginners-Learn-Python-from-Scratch
Code repository for Python for Beginners: Learn Python from Scratch, published by Packt
☆16 · Oct 16, 2023 · Updated 2 years ago
razevedo1994 / razv-data-engineering
Portfolio of projects and studies conducted in data engineering.
☆34 · Feb 22, 2025 · Updated last year
prabhnoor0212 / Siamese-Network-Text-Similarity
This is a basic Siamese network for checking similarity of texts. The example used is the Kaggle Question Pair Similarity Dataset.
☆13 · Apr 30, 2020 · Updated 6 years ago
totpero / awesome-qgis
An awesome list that curates the best QGIS frameworks, libraries, tools, plugins, tutorials, articles, resources and more.
☆28 · Apr 6, 2026 · Updated 3 months ago
mortymacs / abcmeta
Python meta class and abstract method library with restrictions.
☆11 · Jan 23, 2026 · Updated 5 months ago
guidok91 / spark-movies-etl
Spark data pipeline that processes movie ratings data.
☆31 · Jul 12, 2026 · Updated last week
Wenuka / correlationTracker
Multiple object tracking using dlib library, Python
☆10 · Nov 6, 2017 · Updated 8 years ago
xufeifeiWHU / Mobilenet-v2-on-Movidius-stick
Translate MobileNet v2 to the Movidius stick.
☆11 · Aug 14, 2018 · Updated 7 years ago
sanjaynaikwadi / kubernetes
Examples of deployments, pods, configmap, autoscaling
☆12 · Jun 7, 2020 · Updated 6 years ago
STIX-Modeler / UI
STIX 2.1 Data Modeling Tool
☆28 · Jul 2, 2024 · Updated 2 years ago