Modalities / modalitiesLinks

Modalities, a PyTorch-native framework for distributed and reproducible foundation model training.

☆90

Alternatives and similar repositories for modalities

Users that are interested in modalities are comparing it to the libraries listed below

Sorting:

epfml / llm-baselines
nanoGPT-like codebase for LLM training
☆110Updated this week
anthropics / toy-models-of-superposition
Notebooks accompanying Anthropic's "Toy Models of Superposition" paper
☆129Updated 3 years ago
tommasomncttn / mergenetic
Flexible library for merging large language models (LLMs) via evolutionary optimization (ACL 2025 Demo).
☆91Updated 2 months ago
taufeeque9 / codebook-features
Sparse and discrete interpretability tool for neural networks
☆64Updated last year
ludwigwinkler / JaxLightning
Running Jax in PyTorch Lightning
☆113Updated 10 months ago
ml-jku / SDLG
SDLG is an efficient method to accurately estimate aleatoric semantic uncertainty in LLMs
☆26Updated last year
VectorInstitute / vector-inference
Efficient LLM inference on Slurm clusters using vLLM.
☆81Updated last week
vertaix / Vendi-Score
☆136Updated 3 months ago
srush / do-we-need-attention
☆166Updated 2 years ago
bartbussmann / BatchTopK
Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs)
☆51Updated 3 months ago
EleutherAI / nanoGPT-mup
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆170Updated 4 months ago
jxiw / BiGS
Official Repository of Pretraining Without Attention (BiGS), BiGS is the first model to achieve BERT-level transfer learning on the GLUE …
☆114Updated last year
shikaiqiu / compute-better-spent
☆60Updated last year
AntreasAntoniou / kubejobs
☆31Updated 6 months ago
microsoft / mutransformers
some common Huggingface transformers in maximal update parametrization (µP)
☆86Updated 3 years ago
epfml / schedules-and-scaling
Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"
☆84Updated last year
athms / mad-lab
A MAD laboratory to improve AI architecture designs 🧪
☆132Updated 10 months ago
captain-pool / hydra-wandb-sweeper
WandB sweeps integration with Hydra sweeper
☆50Updated last year
HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆234Updated last month
lucidrains / pause-transformer
Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…
☆52Updated 2 years ago
pietrobarbiero / pytorch_explain
PyTorch Explain: Interpretable Deep Learning in Python.
☆163Updated last year
google-deepmind / mishax
☆142Updated last month
KihoPark / LLM_Categorical_Hierarchical_Representations
☆111Updated 8 months ago
ethancaballero / broken_neural_scaling_laws
Code Release for "Broken Neural Scaling Laws" (BNSL) paper
☆59Updated 2 years ago
guy-dar / embedding-space
☆55Updated 2 years ago
meta-pytorch / torchfix
TorchFix - a linter for PyTorch-using code with autofix support
☆148Updated 2 months ago
TorchDR / TorchDR
TorchDR - PyTorch Dimensionality Reduction
☆167Updated last month
luyug / magix
Supercharge huggingface transformers with model parallelism.
☆77Updated 3 months ago
google-research / optformer
☆230Updated 2 weeks ago
apoorvkh / academic-pretraining
$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources
☆147Updated last month