apple / ml-sigma-reparam
☆292 · Updated 4 months ago
Related projects
Alternatives and complementary repositories for ml-sigma-reparam
- For optimization algorithm research and development. ☆449 · Updated this week
- JAX implementation of the Llama 2 model ☆210 · Updated 9 months ago
- WIP ☆89 · Updated 3 months ago
- Annotated version of the Mamba paper ☆457 · Updated 8 months ago
- Understand and test language model architectures on synthetic tasks. ☆162 · Updated 6 months ago
- Muon optimizer for neural networks: >30% extra sample efficiency, <3% wallclock overhead ☆109 · Updated last week
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆113 · Updated 7 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆84 · Updated last week
- ☆128 · Updated this week
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ☆151 · Updated 7 months ago
- Scalable neural net training via automatic normalization in the modular norm. ☆121 · Updated 3 months ago
- ☆73 · Updated 4 months ago
- Helpful tools and examples for working with flex-attention ☆469 · Updated 3 weeks ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆95 · Updated 6 months ago
- ☆161 · Updated last year
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch ☆476 · Updated 3 weeks ago
- Universal Tensor Operations in Einstein-Inspired Notation for Python. ☆328 · Updated last month
- A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton. ☆483 · Updated 3 weeks ago
- ☆197 · Updated 4 months ago
- ☆178 · Updated last week
- A puzzle to learn about prompting ☆121 · Updated last year
- Legible, Scalable, Reproducible Foundation Models with Named Tensors and Jax ☆516 · Updated this week
- Implementation of the Llama architecture with RLHF + Q-learning ☆157 · Updated 10 months ago
- Efficient optimizers ☆79 · Updated this week
- Best practices & guides on how to write distributed pytorch training code ☆286 · Updated 2 weeks ago
- A simple library for scaling up JAX programs ☆127 · Updated 2 weeks ago
- Fast bare-bones BPE for modern tokenizer training ☆142 · Updated last month
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… ☆119 · Updated 3 months ago
- A repository for log-time feedforward networks ☆216 · Updated 7 months ago
- ☆139 · Updated 3 months ago