Pathairush/rdbms_to_hdfs_data_pipeline

A data pipeline moving data from a Relational database system (RDBMS) to a Hadoop file system (HDFS).

☆15

Alternatives and similar repositories for rdbms_to_hdfs_data_pipeline

Users who are interested in rdbms_to_hdfs_data_pipeline are comparing it to the repositories listed below.

Stefen-Taime / ETL-Data-Pipeline-RDBMS-TO-HDFS-using-Airflow-Apache-Sqoop-Spark-Postgres-and-Hive
This project aims to move the data from a Relational database system (RDBMS) to a Hadoop file system (HDFS)
☆11 · Apr 29, 2022 · Updated 4 years ago
mmphego / simple-etl
☆16 · Jan 19, 2022 · Updated 4 years ago
NitinSPatil15 / Project-3-Data-Warehouse-with-AWS
An ETL pipeline that extracts data from S3, stages them in Redshift, and transforms data into a set of dimensional tables
☆16 · May 5, 2020 · Updated 6 years ago
vim89 / datapipelines-essentials-python
Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio…
☆56 · May 6, 2023 · Updated 3 years ago
naveenkrsh / books
☆15 · Jan 22, 2017 · Updated 9 years ago
DucAnhNTT / bigdata-ETL-pipeline
The Data Pipeline and Analytics Stack is a comprehensive solution designed for processing, storing, and visualizing data. Explore a compl…
☆18 · Dec 26, 2023 · Updated 2 years ago
ExpediaGroup / shunting-yard
Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.
☆20 · Oct 11, 2021 · Updated 4 years ago
siddharth271101 / Covid-19-and-Aviation-Industry
The goal of this project is to analyse the impact of Covid-19 on the Aviation industry through data engineering processes using technolog…
☆13 · Jun 26, 2022 · Updated 4 years ago
aliawan01 / YoutubeContent
Contains code from Youtube Tutorials or Videos.
☆14 · Nov 24, 2025 · Updated 8 months ago
stevenhurwitt / reddit-streaming
streaming eight subreddits from reddit api using kafka producer & spark structured streaming.
☆19 · Jul 1, 2026 · Updated 3 weeks ago
glagol-dsl / glagol-dsl
A domain specific language that utilizes Domain-Driven Design
☆17 · Jan 21, 2024 · Updated 2 years ago
tatwan / airflow-spark-aws-emr
☆12 · Mar 6, 2021 · Updated 5 years ago
zacharyt-cs / reddit-data-engineering
An end-to-end data engineering pipeline to create a dashboard for the latest content on the r/Stocks subreddit
☆20 · Aug 5, 2022 · Updated 3 years ago
AswinKumar1 / Forced-Alignment
GSoC'16 RedHen Labs
☆11 · Aug 22, 2016 · Updated 9 years ago
citydaoproject / app
A web app to view CityDAO land parcels
☆35 · Dec 11, 2022 · Updated 3 years ago
1997mahadi / dbt-dlt-ingestion-pipeline
☆21 · Aug 8, 2024 · Updated last year
NorthConcepts / DataPipeline-Examples
DataPipeline Examples
☆17 · Updated this week
YMarrakchi / CICL
Code accompanying the paper "Fighting Class Imbalance with Contrastive Learning" (MICCAI2021)
☆10 · Nov 24, 2021 · Updated 4 years ago
ron-rivest / game-theory-voting-system
Code implementing game-theory based voting system by Emily Shen and Ron Rivest
☆11 · Jan 24, 2014 · Updated 12 years ago
rdisipio / qnlp
NLP stuff with quantum computing
☆17 · Nov 9, 2020 · Updated 5 years ago
darienmt / CarND-LaneLines-P1
Udacity Self Driving Car Nanodegree - Finding Lane Lines in a Video Stream
☆10 · Oct 30, 2018 · Updated 7 years ago
Joshua-omolewa / Stock_streaming_pipeline_project
Built a real-time streaming pipeline to extract stock data, using Apache Nifi, Debezium, Kafka, and Spark Streaming. Loaded the transform…
☆28 · Oct 13, 2023 · Updated 2 years ago
georgesittas / minihaskell-compiler
MiniHaskell compiler and interpreter with a Lucid-like dataflow IR
☆15 · Mar 5, 2023 · Updated 3 years ago
Pathairush / airflow_hive_spark_sqoop
A docker using the airflow with Hadoop ecosystem (hive, spark, and sqoop)
☆12 · May 2, 2021 · Updated 5 years ago
UniCourt / DataEngineering-Workshop1
☆27 · Jul 19, 2024 · Updated 2 years ago
BBVA / data-refinery
Data transformation
☆23 · Apr 18, 2021 · Updated 5 years ago
gazpachu / joanmira
Joan Mira Studio Website
☆12 · Dec 6, 2025 · Updated 7 months ago
ypraw / stowtui
A simple TUI for stow
☆15 · Apr 13, 2021 · Updated 5 years ago
sungchun12 / serverless-data-pipeline-gcp
Schedule a data pipeline in Google Cloud using cloud function, BigQuery, cloud storage, cloud scheduler, stack trace, cloud build, and p…
☆25 · Jun 4, 2019 · Updated 7 years ago
ndleah / northwind
💼 SQL, company data analysis
☆17 · Nov 12, 2021 · Updated 4 years ago
justsml / dans-blog
the new danlevy.net
☆16 · Updated this week
aukgit / scala-open-real-time-bidding-rtb
Scala Real Time Bidding System using open-rtb protocol (openrtb) [IAB open RTB 2.3 specs] - Simulation
☆13 · Jun 27, 2020 · Updated 6 years ago
ba5tz / vbaTutorial
Repo for the collection of files and tutorial links that I discuss on the Andi Setiadi YouTube channel
☆25 · Nov 9, 2023 · Updated 2 years ago
thalesbruno / fastapi-boilerplate
A FastAPI boilerplate application
☆11 · Sep 5, 2020 · Updated 5 years ago
albertovpd / automated_etl_google_cloud-social_dashboard
A dashboard is worth a thousand words => https://datastudio.google.com/reporting/755f3183-dd44-4073-804e-9f7d3d993315
☆28 · Oct 30, 2021 · Updated 4 years ago
angelddaz / de-challenges
Project based learning for Data Engineering fundamentals.
☆13 · Jan 15, 2021 · Updated 5 years ago
codingforentrepreneurs / Serverless-Python-Workflow-with-AWS-Lambda
A tutorial to setup and deploy a simple Serverless Python workflow with REST API endpoints in AWS Lambda.
☆22 · Apr 22, 2020 · Updated 6 years ago
10Kang / DE_Zoomcamp2024_ZY
Repository for Data Engineering Zoomcamp 2024
☆14 · Mar 25, 2024 · Updated 2 years ago
edo0xff / playwarlock
Download movies and series for free, quickly and easily.
☆12 · Apr 21, 2026 · Updated 3 months ago