Repo to migrate old wiki to, esp for devs and code examples
☆183 Oct 18, 2016 Updated 9 years ago
Alternatives and similar repositories for data-engineering-ecosystem
Users who are interested in data-engineering-ecosystem are comparing it to the repositories listed below
- Sharing interesting and noteworthy Data Engineering content ☆70 Oct 21, 2016 Updated 9 years ago
- Data Engineering Project at Insight ☆15 Nov 17, 2015 Updated 10 years ago
- Ansible playbook to deploy distributed technologies ☆67 Nov 20, 2017 Updated 8 years ago
- ☆14 Jun 27, 2017 Updated 8 years ago
- An API to Analyze Cab GeoLocation Data and a Simulated App for finding an available cab in Real-Time ☆62 Feb 23, 2015 Updated 11 years ago
- Red Hat's business logic for maintaining marketing data quality ☆12 Oct 21, 2021 Updated 4 years ago
- A way for home buyers to know about factors affecting a state ☆48 Mar 2, 2019 Updated 7 years ago
- ☆26 Aug 23, 2017 Updated 8 years ago
- Learning from multiple companies in Silicon Valley. Netflix, Facebook, Google, Startups ☆897 May 8, 2022 Updated 3 years ago
- Challenge for those applying to the Software Engineer, Big Data position ☆35 Oct 12, 2011 Updated 14 years ago
- A curated list of data engineering tools for software developers ☆8,385 Feb 21, 2026 Updated 3 weeks ago
- Building Scio from scratch step by step ☆20 May 20, 2019 Updated 6 years ago
- How to build an awesome data engineering team ☆101 Sep 11, 2019 Updated 6 years ago
- Random implementation notes ☆34 Apr 23, 2013 Updated 12 years ago
- In the Data Science and Engineering program, engineering professionals combine the skills of software programmer, database manager, and s… ☆29 Nov 4, 2017 Updated 8 years ago
- Udacity Data Engineering Nano Degree (DEND) ☆189 Jan 20, 2020 Updated 6 years ago
- Examples of deploying scikit, spaCy and Keras (TensorFlow) machine learning models to AWS Lambda with Serverless framework and Python 3. ☆31 Dec 8, 2022 Updated 3 years ago
- Data pipeline is a tool to run Data loading pipelines. It is an open sourced app engine app that users can extend to suit their own needs… ☆87 Feb 11, 2014 Updated 12 years ago
- Companion code for YouTube video: https://www.youtube.com/watch?v=Og7CGAfSr_Y&feature=youtu.be ☆23 Sep 25, 2014 Updated 11 years ago
- My Data Engineering project @ Insight Data Science ☆10 Jul 23, 2018 Updated 7 years ago
- Quora Kaggle Competition : Natural Language Processing using word2vec embeddings, scikit-learn and xgboost for training ☆18 Jan 13, 2019 Updated 7 years ago
- Example end to end data engineering project. ☆1,394 Dec 8, 2022 Updated 3 years ago
- Case study describing Red Hat Marketing Operations use of Luigi on top of Openshift ☆12 Apr 17, 2017 Updated 8 years ago
- Play with various big data technologies ☆10 Jul 12, 2017 Updated 8 years ago
- ☆12 Nov 4, 2023 Updated 2 years ago
- This repo contains commands that data engineers use in day to day work. ☆61 Feb 4, 2023 Updated 3 years ago
- ☆12 Mar 31, 2020 Updated 5 years ago
- A Rust based deduplication tool ☆34 Jun 26, 2025 Updated 8 months ago
- A library, that provides Conflict Free Replicated Data Types (CRDTs) for distributed Python applications. ☆17 Jan 10, 2019 Updated 7 years ago
- Code to build a simple analytics data pipeline with Python ☆102 Mar 11, 2017 Updated 9 years ago
- DB2/DashDB Connector for Apache Spark ☆14 Jul 30, 2021 Updated 4 years ago
- The Data Engineering Cookbook ☆14,989 Jan 17, 2026 Updated 2 months ago
- Cloudformation template for deploying Presto on AWS ☆13 Jul 20, 2020 Updated 5 years ago
- A list of useful resources to learn Data Engineering from scratch ☆3,966 Jun 19, 2024 Updated last year
- Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow ☆166 Jun 16, 2020 Updated 5 years ago
- Airflow-Salesforce connector ☆16 Jul 5, 2017 Updated 8 years ago
- Companion code for the Mastering Advanced Scala book https://leanpub.com/mastering-advanced-scala ☆35 Mar 20, 2021 Updated 5 years ago
- ☆13 Feb 26, 2025 Updated last year
- calling R from a Rails app ☆10 Mar 17, 2016 Updated 10 years ago