Open-Source Software, Tutorials, and Research on Data-Centric AI
★349 · Apr 7, 2026 · Updated last week
Alternatives and similar repositories for awesome-data-centric-ai
Users that are interested in awesome-data-centric-ai are comparing it to the libraries listed below.
- Resources for Data Centric AI · ★1,140 · Dec 13, 2023 · Updated 2 years ago
- Synthetic data generators for tabular and time-series data · ★1,619 · Mar 2, 2026 · Updated last month
- Curated list of open source tooling for data-centric AI on unstructured data. · ★733 · Nov 15, 2023 · Updated 2 years ago
- ★30 · Feb 9, 2023 · Updated 3 years ago
- 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames. · ★13,493 · Apr 10, 2026 · Updated last week
- Standardised Metrics and Methods for Synthetic Tabular Data Evaluation · ★36 · Aug 14, 2024 · Updated last year
- Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022) · ★18 · Mar 20, 2023 · Updated 3 years ago
- Open Source Data Annotation & Labeling Tools · ★693 · Apr 7, 2026 · Updated last week
- ★16 · Jan 4, 2023 · Updated 3 years ago
- A copier template repository for a e2e batch ZenML MLOps pipeline. · ★11 · Dec 17, 2025 · Updated 4 months ago
- Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data … · ★11,434 · Jan 13, 2026 · Updated 3 months ago
- A curated list of MLOps projects, tools and resources · ★187 · Apr 22, 2024 · Updated last year
- The active learning algorithm, mismatch-first farthest-traversal. Implementation and visualization. · ★12 · Dec 25, 2021 · Updated 4 years ago
- Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML va… · ★4,003 · Dec 28, 2025 · Updated 3 months ago
- Modern development with Python in 2024 · ★12 · Updated this week
- a catch-all repo · ★11 · Dec 28, 2023 · Updated 2 years ago
- nannyml: post-deployment data science in python · ★2,134 · Jul 12, 2025 · Updated 9 months ago
- ★13 · May 12, 2023 · Updated 2 years ago
- A starter vault in Obsidian for both work and personal knowledge management, complete with seamless workflows. · ★15 · Nov 11, 2025 · Updated 5 months ago
- ML REPA Library: MLOps and ML Engineering Solutions for Success · ★23 · Jun 26, 2023 · Updated 2 years ago
- ZenML: One AI Platform from Pipelines to Agents. https://zenml.io. · ★5,334 · Updated this week
- ★27 · Oct 13, 2022 · Updated 3 years ago
- An extensible AI agents framework · ★18 · Jun 3, 2025 · Updated 10 months ago
- Lab assignments for Introduction to Data-Centric AI, MIT IAP 2024 · ★480 · Feb 24, 2025 · Updated last year
- Pytest plugin for mocking BigQuery data from the python BigQuery client. · ★14 · Feb 6, 2023 · Updated 3 years ago
- HiPlot fetcher for experiments logged with MLflow · ★14 · May 11, 2022 · Updated 3 years ago
- An open-source data logging library for machine learning models and data pipelines. Provides visibility into data quality & model perf… · ★2,813 · Jan 10, 2025 · Updated last year
- Covid-19 spread simulator with human mobility and intervention modeling. · ★19 · May 28, 2022 · Updated 3 years ago
- A curated list of awesome resources related to Semantic Search and Semantic Similarity tasks. · ★363 · Dec 9, 2025 · Updated 4 months ago
- A curated list of awesome MLOps tools · ★5,094 · Mar 20, 2026 · Updated 3 weeks ago
- A curated list of references for MLOps · ★13,853 · Nov 21, 2024 · Updated last year
- This repository aims to map the ecosystem of artificial intelligence guidelines, principles, codes of ethics, standards, regulation and b… · ★1,424 · Mar 4, 2026 · Updated last month
- A tool for quickly adding labels to unlabeled datasets · ★20 · Jan 12, 2024 · Updated 2 years ago
- SageMaker Experiments and DVC · ★17 · Aug 22, 2022 · Updated 3 years ago
- ctt: CLI and pre-commit tool for testing copier · ★23 · Mar 27, 2026 · Updated 3 weeks ago
- UCLANesl - NIST Differential Privacy Challenge (Match 3) · ★25 · May 30, 2019 · Updated 6 years ago
- A Python class for Reliability analysis including Monte Carlo and FORM methods · ★14 · Apr 24, 2025 · Updated 11 months ago
- SDNist: Benchmark data and evaluation tools for data synthesizers. · ★40 · Mar 26, 2026 · Updated 3 weeks ago
- Classification of multiple non-stationary time-series by using Continuous Wavelet Transformation · ★17 · Feb 22, 2020 · Updated 6 years ago