How to build an awesome data engineering team
☆101 Sep 11, 2019 Updated 6 years ago
Alternatives and similar repositories for data-engineering
Users who are interested in data-engineering are comparing it to the repositories listed below
- Learning from multiple companies in Silicon Valley. Netflix, Facebook, Google, Startups ☆897 May 8, 2022 Updated 3 years ago
- ☆32 Aug 13, 2018 Updated 7 years ago
- A list of useful resources to learn Data Engineering from scratch ☆3,966 Jun 19, 2024 Updated last year
- ☆13 Oct 6, 2019 Updated 6 years ago
- Sharing interesting and noteworthy Data Engineering content ☆70 Oct 21, 2016 Updated 9 years ago
- A few projects related to Data Engineering including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake developme… ☆1,845 Aug 26, 2022 Updated 3 years ago
- A curated list of data engineering tools for software developers ☆8,385 Feb 21, 2026 Updated 3 weeks ago
- Keras implementation of the "Show, Attend and Tell" paper ☆26 Apr 21, 2019 Updated 6 years ago
- All Machine Learning Algorithms ☆27 Oct 17, 2020 Updated 5 years ago
- Simple demonstration of how to build a complex real-time machine learning visualization tool. ☆16 Mar 26, 2016 Updated 9 years ago
- Spark cloud integration: tests, cloud committers and more ☆20 Jan 30, 2025 Updated last year
- ☆197 Feb 25, 2022 Updated 4 years ago
- ☆16 Jun 25, 2019 Updated 6 years ago
- Vietnamese Named Entity Recognition ☆31 Oct 12, 2020 Updated 5 years ago
- ☆10 Aug 18, 2021 Updated 4 years ago
- Big Data Engineering practice project, including ETL with Airflow and Spark using AWS S3 and EMR ☆89 Jul 17, 2019 Updated 6 years ago
- ETL Pipeline using Luigi ☆10 Nov 15, 2017 Updated 8 years ago
- Creates an example AWS DMS for replicating an (on-prem) Oracle database to a cloud-based Postgres database ☆13 Oct 24, 2017 Updated 8 years ago
- Repo to migrate old wiki to, esp for devs and code examples ☆183 Oct 18, 2016 Updated 9 years ago
- Python client for the Serf orchestration tool ☆21 Apr 30, 2021 Updated 4 years ago
- ☆44 Apr 21, 2022 Updated 3 years ago
- ☆10 Nov 7, 2022 Updated 3 years ago
- An Awesome List of Open-Source Data Engineering Projects ☆3,059 Oct 4, 2024 Updated last year
- Automatic labelling and model tuning with Amazon SageMaker ☆12 Jun 10, 2019 Updated 6 years ago
- A simple Spark-powered ETL framework that just works 🍺 ☆185 Oct 2, 2025 Updated 5 months ago
- Funnel analysis data challenge ☆13 Jan 7, 2018 Updated 8 years ago
- In this workshop you will launch an Amazon Redshift cluster in your AWS account and load sample data ~ 100GB using TPCH dataset. You will… ☆24 Nov 28, 2018 Updated 7 years ago
- Found a data engineering challenge or participated in a selection process? Share with us! ☆67 Oct 17, 2022 Updated 3 years ago
- Scripts for personal reference ☆19 Dec 26, 2022 Updated 3 years ago
- Poverty Prediction by Combination of Satellite Imagery ☆43 Nov 23, 2020 Updated 5 years ago
- Know your ML Score based on Sculley's paper ☆34 Apr 22, 2019 Updated 6 years ago
- Software mention extraction and linking from scientific articles ☆14 Sep 2, 2022 Updated 3 years ago
- ☆13 Jun 20, 2018 Updated 7 years ago
- ☆13 Aug 22, 2025 Updated 6 months ago
- Data Brewery is an ETL (Extract-Transform-Load) program that connects to many data sources (cloud services, databases, ...) and manage dat… ☆16 Jan 21, 2021 Updated 5 years ago
- Reference Architectures for Datalakes on AWS ☆78 May 13, 2020 Updated 5 years ago
- Ebook code for Data Science with R: A Resource Compendium ☆16 Oct 4, 2024 Updated last year
- Full Stack Data Science projects centered around Apache Spark Streaming for educational purposes. ☆19 May 1, 2023 Updated 2 years ago
- Spark Custom Stream Source and Sink ☆12 Jan 19, 2019 Updated 7 years ago