Modalities / modalities
Modalities is a PyTorch-native framework for distributed and reproducible foundation model training.
☆84 · Updated last week
Alternatives and similar repositories for modalities
Users interested in modalities are comparing it to the repositories listed below.
- nanoGPT-like codebase for LLM training ☆107 · Updated 4 months ago
- Efficient LLM inference on Slurm clusters using vLLM. ☆77 · Updated this week
- ☆166 · Updated 2 years ago
- Some common Hugging Face transformers in maximal update parametrization (µP) ☆82 · Updated 3 years ago
- Unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆79 · Updated 3 years ago
- ☆31 · Updated 5 months ago
- Supercharge Hugging Face transformers with model parallelism. ☆77 · Updated last month
- ☆82 · Updated last year
- Sparse and discrete interpretability tool for neural networks ☆63 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆160 · Updated 2 months ago
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆128 · Updated 3 years ago
- A case study of efficient training of large language models using commodity hardware. ☆68 · Updated 3 years ago
- A library to create and manage configuration files, especially for machine learning projects. ☆79 · Updated 3 years ago
- Flexible library for merging large language models (LLMs) via evolutionary optimization (ACL 2025 Demo). ☆85 · Updated last month
- [NeurIPS 2023] Learning Transformer Programs ☆162 · Updated last year
- Hierarchical Attention Transformers (HAT) ☆58 · Updated last year
- Official repository of Pretraining Without Attention (BiGS); BiGS is the first model to achieve BERT-level transfer learning on the GLUE … ☆115 · Updated last year
- ☆35 · Updated 9 months ago
- PyTorch library for Active Fine-Tuning ☆91 · Updated last week
- Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale, TACL (2022) ☆130 · Updated 3 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆129 · Updated 9 months ago
- Understand and test language model architectures on synthetic tasks. ☆225 · Updated 2 months ago
- A JAX-based library for building transformers; includes implementations of GPT, Gemma, LLaMA, Mixtral, Whisper, Swin, ViT, and more. ☆291 · Updated last year
- Utilities for the Hugging Face transformers library ☆71 · Updated 2 years ago
- Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023 ☆137 · Updated last year
- A Python library that encapsulates various methods for neuron interpretation and analysis in deep NLP models. ☆105 · Updated last year
- ☆54 · Updated 2 years ago
- ☆38 · Updated last year
- Interpreting the latent space representations of attention head outputs for LLMs ☆34 · Updated last year
- Running JAX in PyTorch Lightning ☆113 · Updated 9 months ago