A simple, easy-to-use Data Quality (DQ) tool built with Python.
☆51 · Sep 7, 2023 · Updated 2 years ago
Alternatives and similar repositories for tinytimmy
Users who are interested in tinytimmy are comparing it to the libraries listed below.
- How to unit test your PySpark code ☆29 · Mar 26, 2021 · Updated 4 years ago
- Materialize plugin for dbt ☆12 · Jan 25, 2021 · Updated 5 years ago
- Example orchestration pipeline for Fivetran + dbt managed by Airflow ☆22 · Feb 18, 2021 · Updated 5 years ago
- Glue VSCode devcontainer setup ☆14 · Jan 31, 2023 · Updated 3 years ago
- Delta Lake helper methods. No Spark dependency. ☆22 · Jan 19, 2026 · Updated 2 months ago
- Python test runner built in Rust ☆19 · Feb 20, 2026 · Updated last month
- A toolkit for managing data access policies as code ☆13 · Apr 18, 2024 · Updated last year
- CloudGlance helps you navigate your AWS environments with ease. It is a single pane of glass for all your AWS credentials and bastion hos… ☆13 · Feb 27, 2025 · Updated last year
- A Rust-based data/CSV/Parquet file generator ☆65 · Mar 3, 2025 · Updated last year
- A Terraform module that deploys Dagster to Azure. ☆11 · May 10, 2021 · Updated 4 years ago
- This checklist aims to be an exhaustive list of all elements you should consider when using Amazon Redshift. ☆15 · Sep 21, 2020 · Updated 5 years ago
- A compilation of the main scikit-learn commands with examples ☆11 · Apr 4, 2023 · Updated 2 years ago
- Template for Data Engineering and Data Pipeline projects ☆117 · Jan 1, 2023 · Updated 3 years ago
- ☆12 · Aug 28, 2024 · Updated last year
- ☆159 · Feb 25, 2026 · Updated 3 weeks ago
- For sharing my adventofcode.com solutions ☆14 · Feb 1, 2025 · Updated last year
- Pythonic programming framework to orchestrate jobs in Databricks Workflow ☆227 · Mar 11, 2026 · Updated last week
- Catalog of datasets related to underserved communities. An intermediary between community organizers and data scientists. ☆47 · Jan 31, 2018 · Updated 8 years ago
- Modern serverless lakehouse implementing HOOK methodology, Unified Star Schema (USS), and Analytical Data Storage System (ADSS) principle… ☆124 · Mar 31, 2025 · Updated 11 months ago
- ☆31 · Jan 13, 2026 · Updated 2 months ago
- ☆10 · Mar 8, 2022 · Updated 4 years ago
- ☆10 · Jan 28, 2025 · Updated last year
- ☆33 · Apr 16, 2024 · Updated last year
- ☆13 · Jul 8, 2024 · Updated last year
- Compare DuckDB, Polars and Pandas for generating an artificial dataset of persons and companies ☆35 · Aug 31, 2023 · Updated 2 years ago
- Schema modelling framework for decentralised domain-driven ownership of data. ☆261 · Dec 5, 2023 · Updated 2 years ago
- Yet Another (Spark) ETL Framework ☆21 · Oct 21, 2023 · Updated 2 years ago
- Events about the open source data stack ☆13 · Apr 16, 2022 · Updated 3 years ago
- Project for "Data pipeline design patterns" blog. ☆51 · Aug 6, 2024 · Updated last year
- A tutorial for using Hadoop with Python and Hive ☆10 · May 26, 2015 · Updated 10 years ago
- EpochFS is a versioned cloud file system with Git-like branching and transaction support. ☆17 · Mar 11, 2026 · Updated last week
- Companion repository to the ETL & ELT Pipelines with Apache Airflow® eBook ☆39 · Feb 16, 2026 · Updated last month
- ☆17 · Nov 25, 2024 · Updated last year
- A script that gets data from the Twitter real-time API, passes it to a message queue (e.g. RabbitMQ) and stores tweets in MongoDB ☆11 · Apr 20, 2017 · Updated 8 years ago
- ☆13 · Jul 28, 2023 · Updated 2 years ago
- Auto-generated Diagrams from Airflow DAGs. 🔮 🪄 ☆357 · Mar 9, 2026 · Updated last week
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects ☆225 · Apr 30, 2025 · Updated 10 months ago
- An ORM-like interface for Google Cloud NoSQL Datastore ☆13 · May 8, 2021 · Updated 4 years ago
- A dbt package for modelling dbt metadata. https://brooklyn-data.github.io/dbt_artifacts ☆390 · Mar 12, 2026 · Updated last week