Open-Source Software, Tutorials, and Research on Data-Centric AI
★349 · Apr 7, 2026 · Updated last month
Alternatives and similar repositories for awesome-data-centric-ai
Users who are interested in awesome-data-centric-ai are comparing it to the libraries listed below.
- Resources for Data Centric AI ★1,142 · Dec 13, 2023 · Updated 2 years ago
- Synthetic data generators for tabular and time-series data ★1,636 · Apr 23, 2026 · Updated last month
- Curated list of open source tooling for data-centric AI on unstructured data. ★731 · Nov 15, 2023 · Updated 2 years ago
- Data Quality assessment with one line of code ★456 · Apr 23, 2026 · Updated last month
- ★30 · Feb 9, 2023 · Updated 3 years ago
- 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames. ★13,567 · Apr 22, 2026 · Updated last month
- Dvc + Streamlit = ❤️ ★40 · Oct 27, 2023 · Updated 2 years ago
- Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022) ★18 · Mar 20, 2023 · Updated 3 years ago
- Open Source Data Annotation & Labeling Tools ★699 · Apr 7, 2026 · Updated last month
- A tool to generate stubs for Python packages using numpydoc-format docstrings and monkeytype traces ★13 · Mar 14, 2024 · Updated 2 years ago
- NIST Collaborative Research Cycle on Synthetic Data. Learn about Synthetic Data week by week! ★27 · Jul 13, 2023 · Updated 2 years ago
- A copier template repository for a e2e batch ZenML MLOps pipeline. ★14 · Dec 17, 2025 · Updated 5 months ago
- Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data … ★11,475 · Jan 13, 2026 · Updated 4 months ago
- ★15 · Jul 16, 2014 · Updated 11 years ago
- A curated list of MLOps projects, tools and resources ★188 · Apr 22, 2024 · Updated 2 years ago
- Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML va… ★4,017 · Dec 28, 2025 · Updated 5 months ago
- Modern development with Python in 2024 ★12 · May 18, 2026 · Updated last week
- ★13 · May 12, 2023 · Updated 3 years ago
- ML REPA Library: MLOps and ML Engineering Solutions for Success ★23 · Jun 26, 2023 · Updated 2 years ago
- ZenML: One AI Platform from Pipelines to Agents. https://zenml.io. ★5,423 · Updated this week
- Lab assignments for Introduction to Data-Centric AI, MIT IAP 2024 👩🏽‍💻 ★480 · Feb 24, 2025 · Updated last year
- Hyperparameter tuning via uncertainty modeling ★51 · May 3, 2024 · Updated 2 years ago
- Pytest plugin for mocking BigQuery data from the python BigQuery client. ★14 · Feb 6, 2023 · Updated 3 years ago
- Make data labeling easy with Jupyter notebooks and Google Sheets! ★27 · Sep 19, 2019 · Updated 6 years ago
- cleanpy is a CLI tool to remove caches and temporary files related to Python. ★19 · Apr 13, 2026 · Updated last month
- An open-source data logging library for machine learning models and data pipelines. Provides visibility into data quality & model perf… ★2,820 · Jan 10, 2025 · Updated last year
- ★14 · Aug 18, 2023 · Updated 2 years ago
- A curated list of awesome resources related to Semantic Search and Semantic Similarity tasks. ★365 · Dec 9, 2025 · Updated 5 months ago
- A curated list of awesome MLOps tools ★5,154 · Apr 29, 2026 · Updated last month
- A curated list of references for MLOps ★13,916 · Nov 21, 2024 · Updated last year
- This repository aims to map the ecosystem of artificial intelligence guidelines, principles, codes of ethics, standards, regulation and b… ★1,439 · May 22, 2026 · Updated last week
- Easy-to-use self-supervised representation learning for industrial AI ★25 · Feb 23, 2023 · Updated 3 years ago
- VSCode extension for ZenML ★21 · May 19, 2026 · Updated last week
- SageMaker Experiments and DVC ★17 · Aug 22, 2022 · Updated 3 years ago
- A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning ★20,553 · Updated this week
- UCLANesl - NIST Differential Privacy Challenge (Match 3) ★25 · May 30, 2019 · Updated 7 years ago
- SDNist: Benchmark data and evaluation tools for data synthesizers. ★41 · Mar 26, 2026 · Updated 2 months ago
- A CLI tool for managing project generator templates such as Cookiecutter and Copier ★23 · Aug 14, 2022 · Updated 3 years ago
- Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. Fro… ★7,538 · May 2, 2026 · Updated 3 weeks ago