mlcommons / croissant
Croissant is a high-level format for machine learning datasets that brings together four rich layers.
☆556 Updated this week
Alternatives and similar repositories for croissant:
Users who are interested in croissant are comparing it to the libraries listed below.
- Inspect: A framework for large language model evaluations ☆847 Updated this week
- Website for hosting the Open Foundation Models Cheat Sheet. ☆265 Updated 3 weeks ago
- Transform datasets at scale. Optimize datasets for fast AI model training. ☆436 Updated this week
- AI Data Management & Evaluation Platform ☆215 Updated last year
- 🤖 A PyTorch library of curated Transformer models and their composable components ☆884 Updated 11 months ago
- git extension for {collaborative, communal, continual} model development ☆209 Updated 4 months ago
- Let's build better datasets, together! ☆257 Updated 3 months ago
- skops is a Python library helping you share your scikit-learn based models and put them in production ☆469 Updated 3 weeks ago
- An interactive HTML pretty-printer for machine learning research in IPython notebooks. ☆399 Updated 2 weeks ago
- 🧠🔗 Graph-Based Programmable Neuro-Symbolic LM Framework - a production-first LM framework built with decade old Deep Learning best prac… ☆141 Updated last week
- Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project. ☆552 Updated 10 months ago
- Weave is a toolkit for developing AI-powered applications, built by Weights & Biases. ☆852 Updated this week
- An example starter repo for Python projects ☆247 Updated last week
- ☆221 Updated this week
- Interpretability for sequence generation models 🐛 🔍 ☆410 Updated 4 months ago
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use… ☆102 Updated this week
- Creative interactive views of any dataset. ☆837 Updated 3 months ago
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆562 Updated this week
- Late Interaction Models Training & Retrieval ☆264 Updated last week
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆255 Updated last year
- ☆217 Updated last month
- ☆124 Updated last week
- A Python package housing a collection of deep-learning multi-modal data fusion method pipelines! From data loading, to training, to evalu… ☆178 Updated this week
- Gain clues from clustering! ☆313 Updated 8 months ago
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ☆156 Updated 11 months ago
- Create powerful Hydra applications without the yaml files and boilerplate code. ☆367 Updated this week
- The Data Cards Playbook helps dataset producers and publishers adopt a people-centered approach to transparency in dataset documentation. ☆178 Updated 10 months ago
- The balance python package offers a simple workflow and methods for dealing with biased data samples when looking to infer from them to s… ☆697 Updated last week
- SUQL: Conversational Search over Structured and Unstructured Data with LLMs ☆258 Updated last week
- Textbook on reinforcement learning from human feedback ☆505 Updated this week