coredataengineers / CDE-BOOTCAMP
Code base for CDE bootcamp
☆74 · Jan 17, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for CDE-BOOTCAMP
Users who are interested in CDE-BOOTCAMP are comparing it to the repositories listed below.
- ☆76 · Aug 22, 2024 · Updated last year
- ☆10 · Apr 8, 2024 · Updated last year
- ☆15 · Dec 2, 2020 · Updated 5 years ago
- ☆14 · Feb 17, 2024 · Updated last year
- ☆25 · Mar 12, 2024 · Updated last year
- A step-by-step learning journey with dltHub: building modern, Python-based data ingestion pipelines from beginner to advanced. ☆28 · Oct 17, 2025 · Updated 3 months ago
- Yahoo! news dataset of DeepCom (EMNLP2019) ☆18 · Jan 21, 2021 · Updated 5 years ago
- Comparing PyTorch, JIT and ONNX for inference with Transformers ☆20 · Feb 22, 2021 · Updated 4 years ago
- ☆19 · Apr 3, 2024 · Updated last year
- ☆21 · Aug 8, 2024 · Updated last year
- Blog post on ETL pipelines with Airflow ☆24 · Aug 31, 2025 · Updated 5 months ago
- Terraform module to create AWS Batch resources 🇺🇦 ☆39 · Jan 8, 2026 · Updated last month
- Data Engineering on GCP ☆41 · Oct 20, 2022 · Updated 3 years ago
- This project is for demonstrating knowledge of Data Engineering tools and concepts and also learning in the process ☆48 · Dec 1, 2022 · Updated 3 years ago
- Apache Airflow https://airflow.apache.org ☆50 · Updated this week
- This is a repository of python notebooks used in my blog posts on medium. ☆47 · Aug 7, 2020 · Updated 5 years ago
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆51 · Sep 7, 2023 · Updated 2 years ago
- Code for my "Efficient Data Processing in SQL" book. ☆60 · Aug 6, 2024 · Updated last year
- A Rust based data/CSV/Parquet file generator ☆64 · Mar 3, 2025 · Updated 11 months ago
- ☆90 · Sep 14, 2022 · Updated 3 years ago
- Spark runtime on AWS Lambda ☆113 · Aug 28, 2025 · Updated 5 months ago
- A real-time reddit data streaming pipeline for sentiment analysis of various subreddits ☆143 · Aug 23, 2023 · Updated 2 years ago
- Deploy transformers serverless on AWS Lambda ☆122 · Aug 20, 2021 · Updated 4 years ago
- 📈 The panel-highcharts package makes it easy to use HighCharts in Python, Notebooks and with HoloViz Panel. ☆159 · Oct 19, 2022 · Updated 3 years ago
- ☆168 · May 20, 2022 · Updated 3 years ago
- Sample project to demonstrate data engineering best practices ☆202 · Feb 24, 2024 · Updated last year
- In this repository we store all materials for dlt workshops, courses, etc. ☆248 · Dec 11, 2025 · Updated 2 months ago
- Code from the book Fighting Churn With Data ☆311 · Aug 2, 2025 · Updated 6 months ago
- Delta Lake helper methods in PySpark ☆327 · Jan 19, 2026 · Updated 3 weeks ago
- Projects done in the Data Engineering Nanodegree by Udacity.com ☆273 · Aug 7, 2019 · Updated 6 years ago
- Educational project on how to build an ETL (Extract, Transform, Load) data pipeline, orchestrated with Airflow. ☆347 · Jan 12, 2022 · Updated 4 years ago
- Source Code for 'Hands-on Time Series Analysis with Python' by B V Vishwas and Ashish Patel ☆366 · Sep 8, 2020 · Updated 5 years ago
- The resources of the preparation course for Databricks Data Engineer Associate certification exam ☆590 · Dec 26, 2025 · Updated last month
- Apache Airflow - OpenApi Client for Python ☆445 · Jan 22, 2026 · Updated 3 weeks ago
- Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Jo… ☆38,379 · Updated this week
- PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster ☆488 · Oct 15, 2024 · Updated last year
- Simple web app example serving a PyTorch model using streamlit and FastAPI ☆502 · Feb 12, 2024 · Updated 2 years ago
- My Insight Data Engineering Fellowship project. I implemented a big data processing pipeline based on lambda architecture, that aggrega… ☆508 · Aug 24, 2022 · Updated 3 years ago
- pg_lake: Postgres with Iceberg and data lake access ☆1,407 · Feb 9, 2026 · Updated last week