mlcommons / croissant
Croissant is a high-level format for machine learning datasets that brings together four rich layers.
★450 · Updated last week
Related projects
Alternatives and complementary repositories for croissant
- My personal frontpage app ★78 · Updated this week
- Explore and interpret large embeddings in your browser with interactive visualization! ★424 · Updated 9 months ago
- Website for hosting the Open Foundation Models Cheat Sheet. ★257 · Updated 4 months ago
- Weave is a toolkit for developing AI-powered applications, built by Weights & Biases. ★722 · Updated this week
- Scalable data pre-processing and curation toolkit for LLMs ★615 · Updated this week
- AI Data Management & Evaluation Platform ★215 · Updated last year
- Transform datasets at scale. Optimize datasets for fast AI model training. ★368 · Updated this week
- Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project. ★534 · Updated 5 months ago
- Let's build better datasets, together! ★206 · Updated this week
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends ★811 · Updated this week
- skops is a Python library helping you share your scikit-learn based models and put them in production ★451 · Updated this week
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ★252 · Updated last year
- An interactive HTML pretty-printer for machine learning research in IPython notebooks. ★337 · Updated 3 weeks ago
- Inspect: A framework for large language model evaluations ★624 · Updated this week
- End-to-end Generative Optimization for AI Agents ★342 · Updated this week
- Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024) ★340 · Updated last month
- A PyTorch library of curated Transformer models and their composable components ★866 · Updated 7 months ago
- Easily embed, cluster and semantically label text datasets ★463 · Updated 7 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ★237 · Updated 4 months ago
- A Python package housing a collection of deep-learning multi-modal data fusion method pipelines! From data loading, to training, to evalu… ★161 · Updated last month
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ★151 · Updated 7 months ago
- Create powerful Hydra applications without the yaml files and boilerplate code. ★339 · Updated last week
- utilities for decoding deep representations (like sentence embeddings) back to text ★737 · Updated 2 months ago
- TorchFix - a linter for PyTorch-using code with autofix support ★103 · Updated last week
- ★199 · Updated this week
- Automatically evaluate your LLMs in Google Colab ★559 · Updated 6 months ago
- large population models ★214 · Updated 3 weeks ago
- In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples. ★386 · Updated 9 months ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ★2,048 · Updated this week