Modalities / modalities
Modalities, a PyTorch-native framework for distributed and reproducible foundation model training.
☆93 · Updated this week
Alternatives and similar repositories for modalities
Users interested in modalities are comparing it to the libraries listed below:
- Flexible library for merging large language models (LLMs) via evolutionary optimization (ACL 2025 Demo). ☆98 · Updated 5 months ago
- nanoGPT-like codebase for LLM training. ☆113 · Updated 3 months ago
- ☆82 · Updated last year
- Some common Hugging Face transformers in maximal update parametrization (µP). ☆87 · Updated 3 years ago
- Efficient LLM inference on Slurm clusters. ☆90 · Updated last week
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations". ☆88 · Updated last year
- ☆62 · Updated last year
- ☆167 · Updated 2 years ago
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation. ☆46 · Updated 3 months ago
- Code for Zero-Shot Tokenizer Transfer. ☆142 · Updated last year
- Understand and test language model architectures on synthetic tasks. ☆252 · Updated 3 weeks ago
- ☆152 · Updated 4 months ago
- SDLG is an efficient method to accurately estimate aleatoric semantic uncertainty in LLMs. ☆28 · Updated last year
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper. ☆135 · Updated 3 years ago
- PyTorch library for Active Fine-Tuning. ☆96 · Updated 4 months ago
- Minimum Description Length probing for neural network representations. ☆20 · Updated last year
- ☆32 · Updated 2 weeks ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆53 · Updated 2 years ago
- Official repository of Pretraining Without Attention (BiGS), the first model to achieve BERT-level transfer learning on the GLUE … ☆116 · Updated last year
- A MAD laboratory to improve AI architecture designs 🧪. ☆137 · Updated last year
- ☆112 · Updated 11 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆186 · Updated 2 weeks ago
- Supercharge Hugging Face transformers with model parallelism. ☆77 · Updated 6 months ago
- ☆39 · Updated last year
- Sparse and discrete interpretability tool for neural networks. ☆64 · Updated last year
- Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, finetuning, evaluating, and serving LLMs in JAX/Fl… ☆78 · Updated last year
- Official implementation of "BERTs are Generative In-Context Learners". ☆32 · Updated 10 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources. ☆150 · Updated 4 months ago
- Language models scale reliably with over-training and on downstream tasks. ☆99 · Updated last year
- A fast implementation of T5/UL2 in PyTorch using Flash Attention. ☆113 · Updated 3 months ago