mikestack15 / orangutan-stem
An open-source project dedicated to constructing robust data pipelines and scalable software infrastructure. We leverage industry-standard tools favored by developers to enhance efficiency and reliability. Uniquely, these pipelines are field-tested on farms across Sumatra, Indonesia, ensuring real-world applicability and resilience.
☆33 Updated last year
Alternatives and similar repositories for orangutan-stem
Users who are interested in orangutan-stem are comparing it to the repositories listed below.
- Data Engineering examples for Airflow, Prefect; dbt for BigQuery, Redshift, ClickHouse, Postgres, DuckDB; PySpark for Batch processing; K… ☆65 Updated last week
- A template repository to create a data project with IAC, CI/CD, Data migrations, & testing ☆262 Updated 10 months ago
- Code for blog at https://www.startdataengineering.com/post/python-for-de/ ☆77 Updated last year
- ☆21 Updated last year
- In this repository we store all materials for dlt workshops, courses, etc. ☆182 Updated 3 weeks ago
- Repo for saving cheat sheets ☆56 Updated last year
- ☆131 Updated 3 months ago
- End to end data engineering project ☆56 Updated 2 years ago
- DataTalks.Club's Data Engineering Zoomcamp Project ☆23 Updated 2 years ago
- 📡 Real-time data pipeline with Kafka, Flink, Iceberg, Trino, MinIO, and Superset. Ideal for learning data systems. ☆44 Updated 4 months ago
- Sample repo for startdataengineering DE 101 free course ☆62 Updated 11 months ago
- This repository provides various demos/examples of using Snowpark for Python. ☆274 Updated last year
- ☆130 Updated 10 months ago
- Project for "Data pipeline design patterns" blog. ☆45 Updated 10 months ago
- Local Environment to Practice Data Engineering ☆142 Updated 5 months ago
- Repo for CDC with debezium blog post ☆28 Updated 8 months ago
- Code for dbt tutorial ☆157 Updated last year
- Data Engineering with Google Cloud Platform, published by Packt ☆118 Updated last year
- build dw with dbt ☆45 Updated 7 months ago
- Code for "Efficient Data Processing in Spark" Course ☆313 Updated 2 weeks ago
- Study Notes for the Snowflake SnowPro Core Certification Exam ☆83 Updated last week
- Data Pipeline from the Global Historical Climatology Network DataSet ☆27 Updated 2 years ago
- Generate synthetic Spotify music stream dataset to create dashboards. Spotify API generates fake event data emitted to Kafka. Spark consu… ☆67 Updated last year
- Step by step instructions to create a production-ready data pipeline ☆50 Updated 5 months ago
- ☆144 Updated last year
- ☆34 Updated 2 years ago
- Stream processing pipeline from Finnhub websocket using Spark, Kafka, Kubernetes and more ☆347 Updated last year
- This project leverages GCS, Composer, Dataflow, BigQuery, and Looker on Google Cloud Platform (GCP) to build a robust data engineering so… ☆24 Updated last year
- Code for my "Efficient Data Processing in SQL" book. ☆56 Updated 10 months ago
- ☆15 Updated last year