Open-Source Software, Tutorials, and Research on Data-Centric AI
☆345 · Feb 10, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for awesome-data-centric-ai
Users who are interested in awesome-data-centric-ai are comparing it to the libraries listed below
- Tutorials for YData's Fabric platform · ☆35 · May 12, 2025 · Updated 9 months ago
- Curated list of open source tooling for data-centric AI on unstructured data. · ☆734 · Nov 15, 2023 · Updated 2 years ago
- Synthetic data generators for tabular and time-series data · ☆1,612 · Updated this week
- Data Quality assessment with one line of code · ☆453 · Feb 27, 2026 · Updated last week
- ☆12 · May 12, 2023 · Updated 2 years ago
- ☆15 · Jul 16, 2014 · Updated 11 years ago
- A curated list of MLOps projects, tools and resources · ☆187 · Apr 22, 2024 · Updated last year
- ☆30 · Feb 9, 2023 · Updated 3 years ago
- Open Source Data Annotation & Labeling Tools · ☆683 · Oct 27, 2025 · Updated 4 months ago
- Awesome list for data journalists and future data journalists · ☆206 · Feb 26, 2026 · Updated last week
- ZenML: One AI Platform from Pipelines to Agents. https://zenml.io. · ☆5,245 · Updated this week
- nannyml: post-deployment data science in python · ☆2,126 · Jul 12, 2025 · Updated 7 months ago
- Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML va… · ☆3,982 · Dec 28, 2025 · Updated 2 months ago
- A curated list of awesome resources related to Semantic Search and Semantic Similarity tasks. · ☆361 · Dec 9, 2025 · Updated 2 months ago
- A curated list of research, applications and projects built using the H2O Machine Learning platform · ☆391 · May 18, 2023 · Updated 2 years ago
- Dvc + Streamlit = ❤️ · ☆40 · Oct 27, 2023 · Updated 2 years ago
- A list of tools for annotating data, managing annotations, etc. · ☆609 · Aug 1, 2024 · Updated last year
- A starter vault in Obsidian for both work and personal knowledge management, complete with seamless workflows. · ☆15 · Nov 11, 2025 · Updated 3 months ago
- ☆14 · Aug 18, 2023 · Updated 2 years ago
- A copier template repository for an e2e batch ZenML MLOps pipeline. · ☆11 · Dec 17, 2025 · Updated 2 months ago
- ☆27 · Oct 13, 2022 · Updated 3 years ago
- Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data … · ☆11,346 · Jan 13, 2026 · Updated last month
- An open-source data logging library for machine learning models and data pipelines. Provides visibility into data quality & model perf… · ☆2,800 · Jan 10, 2025 · Updated last year
- A curated list of awesome MLOps tools · ☆5,043 · Feb 23, 2026 · Updated last week
- a catch-all repo · ☆11 · Dec 28, 2023 · Updated 2 years ago
- A curated list of technology companies, resources, and tools in the agricultural field. · ☆55 · May 30, 2018 · Updated 7 years ago
- ☆12 · May 12, 2020 · Updated 5 years ago
- A framework for evaluating the effectiveness of chain-of-thought reasoning in language models. · ☆19 · Feb 6, 2025 · Updated last year
- Modern development with Python in 2024 · ☆12 · Feb 23, 2026 · Updated last week
- 🧫 A curated list of resources relevant to doing Biomedical Information Extraction (including BioNLP) · ☆417 · May 26, 2024 · Updated last year
- Standardised Metrics and Methods for Synthetic Tabular Data Evaluation · ☆35 · Aug 14, 2024 · Updated last year
- Find primary sources online and learn how to research history digitally. · ☆303 · Feb 20, 2026 · Updated 2 weeks ago
- Minimalistic boilerplate code for training PyTorch models · ☆13 · Jun 10, 2024 · Updated last year
- A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning · ☆20,200 · Updated this week
- Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. Fro… · ☆7,272 · Feb 27, 2026 · Updated last week
- This repository aims to map the ecosystem of artificial intelligence guidelines, principles, codes of ethics, standards, regulation and b… · ☆1,408 · Feb 22, 2026 · Updated last week
- Clone of chatgpt built with Bytewax, Streamlit and NATS · ☆14 · Mar 2, 2023 · Updated 3 years ago
- Curated list of awesome browser extensions that protect your privacy · ☆58 · Mar 31, 2019 · Updated 6 years ago
- An object-oriented Agent Based Model for land use/land cover change and multisector dynamics · ☆18 · Mar 28, 2024 · Updated last year