kwanUm / awesome-data-quality
Curated list of tools and frameworks assisting in monitoring data quality
☆11 · Updated 2 years ago
Related projects:
- Entity resolution for everyone. Minimal. No dependencies. ☆10 · Updated last month
- Metafeature Extraction for Unstructured Data ☆100 · Updated last month
- An end-to-end LLM reference implementation providing a Q&A interface for Airflow and Astronomer ☆192 · Updated 2 months ago
- Terraform templates for deploying mage-ai to AWS, GCP and Azure ☆39 · Updated 3 months ago
- ☆82 · Updated this week
- Open Data Stack Projects: Examples of End to End Data Engineering Projects ☆68 · Updated last year
- Airbyte made simple (no UI, no database, no cluster) ☆140 · Updated 2 weeks ago
- Read infrastructure data from your cloud ☁️ and export it to a SQL database 📋. ☆32 · Updated 10 months ago
- Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications ☆73 · Updated this week
- A playground for running duckdb as a stateless query engine over a data lake ☆156 · Updated 8 months ago
- Data Tools Subjective List ☆80 · Updated last year
- Define, govern, and model event data for warehouse-first product analytics. ☆80 · Updated 2 months ago
- Graphsignal Tracer for Python ☆202 · Updated last month
- Test data management tool for any data source, batch or real-time ☆35 · Updated last week
- Time series forecasting with DuckDB and Evidence ☆33 · Updated 2 months ago
- ☆18 · Updated 3 months ago
- S3 vector database for LLM Agents and RAG. ☆28 · Updated last year
- 🏃‍♀️ Minimalist alternative to dbt ☆203 · Updated last week
- A Python framework for defining and querying BI models in your data warehouse ☆157 · Updated 5 months ago
- Ingesting data with Pulumi, AWS lambdas and Snowflake in a scalable, fully replayable manner ☆66 · Updated 2 years ago
- Houston orchestration API. callhouston.io ☆51 · Updated 7 months ago
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects ☆187 · Updated this week
- Example Dagster Cloud code for the Hooli Data Engineering organization. ☆73 · Updated 2 weeks ago
- Playground for using large language models in the Modern Data Stack for entity matching ☆105 · Updated last year
- A curated list of awesome blogs, videos, tools and resources about Data Contracts ☆158 · Updated last month
- DagsHub client libraries ☆91 · Updated this week
- Anomstack - Painless open source anomaly detection for your metrics 📈📉🚀 ☆86 · Updated 5 months ago
- Sample configuration to deploy a modern data platform. ☆84 · Updated 2 years ago
- A curated list of awesome DataOps tools ☆139 · Updated 3 months ago
- Sample project that uses Dagster, dbt, DuckDB and Dash to visualize the Spanish car and motorcycle market ☆52 · Updated last year