Data Forge — a modern data stack playground to practice flows and best practices, not just tools. Spark, Trino, Kafka, Iceberg, ClickHouse, Airflow, MinIO, Superset — all wired together locally with Docker Compose.
☆176Oct 11, 2025Updated 8 months ago
Alternatives and similar repositories for data-forge
Users who are interested in data-forge are comparing it to the libraries listed below.
- Building a Modern Data Lake with Minio, Spark, Airflow via Docker.☆23May 11, 2024Updated 2 years ago
- Getting Started with Data Engineering☆1,324Apr 20, 2025Updated last year
- Collection of specific use cases for testing your data lineage tool☆12Jun 14, 2024Updated 2 years ago
- Reading both XLSX and XLSB files, fast and memory-safe, with Python, into PyArrow☆12Feb 6, 2024Updated 2 years ago
- Queries the ACCESS_HISTORY and QUERY_HISTORY views, from the SNOWFLAKE.ACCOUNT_USAGE schema, and generates two interactive GraphViz visua…☆12Aug 28, 2023Updated 2 years ago
- Roadmap for Data Engineers. The goal of the roadmap is to get you a job!☆489Mar 30, 2026Updated 2 months ago
- Snowflake scripts and useful snippets☆16Feb 2, 2025Updated last year
- Realtime Data Engineering Project☆31Jan 12, 2025Updated last year
- Build a data warehouse (DW) with dbt☆55Oct 24, 2024Updated last year
- DWH powered by Clickhouse and dbt☆13Aug 4, 2024Updated last year
- Collection of Snowflake Scripting procedures extending GET_DDL function by dwh.dev.☆15Jul 23, 2024Updated last year
- Python course☆39May 20, 2026Updated 3 weeks ago
- Demonstration Database☆41Apr 2, 2026Updated 2 months ago
- ITSumma Spark Greenplum Connector☆43May 6, 2026Updated last month
- Fast data quality framework for modern data infrastructure☆29Apr 2, 2026Updated 2 months ago
- A tutorial that helps Big Data Engineers ramp up faster by getting familiar with PySpark dataframes and functions. It also covers topics …☆20Nov 12, 2021Updated 4 years ago
- ☆17May 22, 2023Updated 3 years ago
- Collection of cookiecutter starter templates for streamlit projects☆16Apr 20, 2022Updated 4 years ago
- Analytics Engineer Course☆20May 17, 2023Updated 3 years ago
- A table-type dbt materialization for Snowflake to enable Time Travel☆22Jan 12, 2026Updated 5 months ago
- Find Niquests at https://github.com/jawah/niquests HTTP/2 HTTP/3 QUIC Async☆12Oct 22, 2024Updated last year
- The bot that sends daily closed issues digest to our team☆52Jun 9, 2019Updated 7 years ago
- Transaction processing & vis pipeline using PySpark Streaming☆29Jul 18, 2024Updated last year
- ☆23May 13, 2025Updated last year
- ☆20May 25, 2025Updated last year
- ☆11Oct 1, 2025Updated 8 months ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ…☆193Jan 5, 2026Updated 5 months ago
- Scrivo - Development helper for MicroPython/Python☆20Feb 14, 2025Updated last year
- A simple ETL with a Docker container☆68May 30, 2025Updated last year
- open source data lake☆32Jan 17, 2025Updated last year
- Django-based backend for our learning management system☆485Updated this week
- Adaptation postgres adapter for Greenplum☆36Mar 7, 2024Updated 2 years ago
- A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineer…☆590Feb 5, 2026Updated 4 months ago
- ☆43Aug 12, 2025Updated 10 months ago
- Docker Compose Workspace manager☆17Dec 22, 2025Updated 5 months ago
- PIK Comfort for Home Assistant☆18Aug 27, 2022Updated 3 years ago
- A PL/SQL package to solve real-world language problems: tokenizing, splitting, classifying, feedback message, and removing terminators.☆21Apr 15, 2023Updated 3 years ago
- Repository of Docker builds for Oracle databases.☆18Jul 3, 2023Updated 2 years ago
- An opinionated data-centric view of Debezium components. Please log issues at https://github.com/debezium/dbz/issues.☆46Updated this week