Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
☆11,420 · Jan 13, 2026 · Updated 3 months ago
Alternatives and similar repositories for cleanlab
Users who are interested in cleanlab are comparing it to the libraries listed below.
- 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale. ☆13,297 · Updated this week
- A curated list of resources for Learning with Noisy Labels ☆2,719 · May 3, 2025 · Updated 11 months ago
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets ☆4,931 · Updated this week
- Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML va… ☆4,001 · Dec 28, 2025 · Updated 3 months ago
- Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes. ☆31,042 · Apr 7, 2026 · Updated last week
- DSPy: The framework for programming—not prompting—language models ☆33,649 · Updated this week
- A system for quickly generating training data with weak supervision ☆5,948 · Updated this week
- Label Studio is a multi-type data labeling and annotation tool with standardized output format ☆26,997 · Apr 10, 2026 · Updated last week
- The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights --… ☆36,637 · Apr 9, 2026 · Updated last week
- A game theoretic approach to explain the output of any machine learning model. ☆25,286 · Updated this week
- Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. Fro… ☆7,385 · Mar 30, 2026 · Updated 2 weeks ago
- A library for efficient similarity search and clustering of dense vectors. ☆39,720 · Updated this week
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities ☆22,086 · Jan 23, 2026 · Updated 2 months ago
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model… ☆159,455 · Updated this week
- 🦉 Data Versioning and ML Experiments ☆15,524 · Apr 7, 2026 · Updated last week
- State-of-the-Art Text Embeddings ☆18,534 · Apr 10, 2026 · Updated last week
- Algorithms for outlier, adversarial and drift detection ☆2,512 · Dec 11, 2025 · Updated 4 months ago
- 🌊 Online machine learning in Python ☆5,791 · Updated this week
- Open-source AI orchestration framework for building context-engineered, production-ready LLM applications. Design modular pipelines and a… ☆24,815 · Updated this week
- A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning ☆20,374 · Updated this week
- The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, a… ☆25,280 · Updated this week
- Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125 ☆15,290 · Jun 25, 2025 · Updated 9 months ago
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ☆32,201 · Sep 30, 2025 · Updated 6 months ago
- DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. ☆42,029 · Updated this week
- 🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents. ☆73,374 · Mar 11, 2026 · Updated last month
- Structured Outputs ☆13,657 · Mar 26, 2026 · Updated 3 weeks ago
- LlamaIndex is the leading document agent and OCR platform ☆48,601 · Updated this week
- 💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows ☆12,395 · Apr 8, 2026 · Updated last week
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ☆42,054 · Apr 10, 2026 · Updated last week
- Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work! ☆42,340 · Updated this week
- A hyperparameter optimization framework ☆13,905 · Apr 8, 2026 · Updated last week
- An open-source, low-code machine learning library in Python ☆9,736 · Apr 21, 2025 · Updated 11 months ago
- ZenML 🙏: One AI Platform from Pipelines to Agents. https://zenml.io. ☆5,334 · Updated this week
- Always know what to expect from your data. ☆11,391 · Updated this week
- The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more! ☆8,580 · Updated this week
- ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling … ☆6,615 · Updated this week
- A playbook for systematically maximizing the performance of deep learning models. ☆30,020 · Jun 18, 2024 · Updated last year
- Model interpretability and understanding for PyTorch ☆5,600 · Updated this week
- Doubt your data, find bad labels. ☆516 · Jul 15, 2024 · Updated last year