Simple ETL pipeline using Python
☆29 • May 22, 2023 • Updated 2 years ago
Alternatives and similar repositories for etljob
Users that are interested in etljob are comparing it to the libraries listed below.
- Swarming behaviour is based on aggregation of simple drones exhibiting basic instinctive reactions to stimuli. However, to achieve overal… ☆12 • Dec 2, 2019 • Updated 6 years ago
- Example end to end data engineering project. ☆1,411 • Dec 8, 2022 • Updated 3 years ago
- simple ETL example ☆16 • Jun 1, 2020 • Updated 5 years ago
- Pyspark Spotify ETL ☆17 • Aug 19, 2021 • Updated 4 years ago
- End-to-end ELT data engineering project ☆23 • Dec 24, 2022 • Updated 3 years ago
- NoSQL extract, transform, load (ETL) toolkit with Python ☆16 • Apr 26, 2026 • Updated 2 weeks ago
- Apache Spark using SQL ☆14 • Aug 18, 2021 • Updated 4 years ago
- A Data Engineering Project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, Dbt, Polars, Doc… ☆23 • Nov 19, 2024 • Updated last year
- Price Crawler - Tracking Price Inflation ☆205 • Jun 23, 2020 • Updated 5 years ago
- This repository contains tasks on how to build an ETL pipeline for the online transaction data of an e-commerce company. ☆18 • Jun 27, 2023 • Updated 2 years ago
- This data project can be used as a take-home assignment to learn Pyspark and Data Engineering. ☆18 • Feb 19, 2023 • Updated 3 years ago
- ☆41 • Apr 30, 2026 • Updated last week
- Beginner data engineering project - batch edition ☆581 • Apr 13, 2026 • Updated 3 weeks ago
- A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from loc… ☆23 • May 14, 2022 • Updated 3 years ago
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin… ☆78 • Sep 2, 2023 • Updated 2 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆56 • May 6, 2023 • Updated 3 years ago
- Simple, easy, parallel map function for python ☆10 • Feb 11, 2022 • Updated 4 years ago
- A batch Data Pipeline that retrieves data from a user purchase table and a movie review table and is transformed to form a user behaviour… ☆18 • Aug 14, 2025 • Updated 8 months ago
- Built a stream processing data pipeline to get data from disparate systems into a dashboard using Kafka as an intermediary. ☆29 • Aug 14, 2023 • Updated 2 years ago
- an end-to-end data pipeline extracting music listening habits and producing an insightful dashboard ☆17 • Mar 31, 2024 • Updated 2 years ago
- For this project I am creating an ETL (Extract, Transform, and Load) pipeline using Python, RegEx, and SQL Database. The goal is to retri… ☆26 • Feb 9, 2021 • Updated 5 years ago
- Spark Structured Streaming data pipeline that processes movie ratings data in real-time. ☆14 • Apr 15, 2026 • Updated 3 weeks ago
- A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousin… ☆15 • Apr 29, 2021 • Updated 5 years ago
- ☆16 • Jul 15, 2023 • Updated 2 years ago
- Portfolio of projects and studies conducted in data engineering. ☆34 • Feb 22, 2025 • Updated last year
- Step by step instructions to create a production-ready data pipeline ☆60 • Dec 23, 2024 • Updated last year
- Data Pipeline from the Global Historical Climatology Network DataSet ☆27 • Dec 20, 2022 • Updated 3 years ago
- Core Java Basic Program ☆16 • Oct 30, 2025 • Updated 6 months ago
- Practical Data Engineering: A Hands-On Real-Estate Project Guide ☆800 • Mar 10, 2026 • Updated last month
- Example gaming leaderboard application covering streaming ingestion, CDC enrichment, processing and visualisation including demo of advan… ☆21 • Nov 18, 2025 • Updated 5 months ago
- Resources and projects from Udacity Data Engineering with AWS nano degree programme ☆29 • Apr 12, 2023 • Updated 3 years ago
- Rust SQL transformation engine with branches, replay, column-level lineage, compile-time type safety, and per-model cost attribution. Sin… ☆230 • Updated this week
- ☆26 • Mar 31, 2025 • Updated last year
- ☆11 • Jan 9, 2022 • Updated 4 years ago
- Build a data pipeline with Apache Airflow ☆11 • May 7, 2021 • Updated 5 years ago
- A repo to track data engineering projects ☆13 • Nov 11, 2022 • Updated 3 years ago
- Codes, datasets, and explanations for some basic natural language tasks and models. ☆11 • Dec 9, 2020 • Updated 5 years ago
- This course introduced me to three cutting-edge technologies for privacy-preserving AI: Federated Learning, Differential Privacy, and Enc… ☆11 • Sep 2, 2019 • Updated 6 years ago
- A multipurpose Wireless Surveillance Rover an Electronics project made with arduino. ☆15 • Dec 14, 2018 • Updated 7 years ago