kwanUm / awesome-data-quality
Curated list of tools and frameworks assisting in monitoring data quality
☆12 Updated 3 years ago
Alternatives and similar repositories for awesome-data-quality
Users who are interested in awesome-data-quality compare it to the libraries listed below.
- Entity resolution for everyone. Minimal. No dependencies. ☆9 Updated 2 months ago
- Progzee is a Python library for simplifying IP proxy usage in HTTP requests. ☆16 Updated 3 months ago
- List of entity resolution software and resources. ☆75 Updated 4 months ago
- Sord Data Fabric: A Vue 3 frontend with a Python WebSocket server, leveraging a distributed architecture with DeltaLake and DuckDB worker… ☆18 Updated last year
- Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications ☆98 Updated 8 months ago
- HyPSTER - HyperParameter optimization on STERoids ☆48 Updated 7 months ago
- DuckDB Community Extension to prompt LLMs from SQL ☆49 Updated 5 months ago
- S3 vector database for LLM Agents and RAG. ☆43 Updated last year
- dpq is an open-source python library that makes prompt-based data transformations and feature engineering easy ☆24 Updated last year
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆153 Updated this week
- ☆37 Updated 2 months ago
- The Modern Data Stack in a Python package ☆49 Updated last year
- Read infrastructure data from your cloud ☁️ and export it to a SQL database 📋. ☆33 Updated last year
- ☆63 Updated this week
- Pushdown compute from Snowflake to DuckDB running on your infrastructure ☆185 Updated this week
- A library to find and visualise the most interesting slices in multidimensional data ☆108 Updated 2 months ago
- Discover the simplicity and strength of Duckdb, dbt, and Iceberg in this project. Create an efficient, versatile data analytics solution … ☆34 Updated last year
- Tutorials, templates for running glassflow pipelines ☆30 Updated 4 months ago
- Serverless for data practitioners. The fastest ⚡️ way to run your code in the cloud. Effortlessly run scripts, functions, and Jupyter not… ☆39 Updated last year
- Contribute to dlt verified sources 🔥 ☆87 Updated last week
- An example of a RAG backend plus UI ☆51 Updated 6 months ago
- ☆11 Updated 4 months ago
- Heimdall is a data orchestration and job execution platform ☆52 Updated this week
- Generate dbt yml files using the CUE language ☆11 Updated last year
- simplifies the process of creating and managing LLM workflows. ☆104 Updated 8 months ago
- The AI-powered CLI Assistant ☆27 Updated last year
- 🛡️ Managed isolated environments for Python ☆94 Updated last week
- Malloy Composer is a simple application to build dashboards or run ad-hoc queries using an existing Malloy model ☆67 Updated 3 weeks ago
- Hyperparam local dataset viewer ☆22 Updated this week
- Python wrapper for the Sling CLI tool ☆53 Updated last week