languini-kitchen / languini-kitchen
The official Languini Kitchen repository
☆14 Updated 6 months ago
Related projects
Alternatives and complementary repositories for languini-kitchen
- ☆46 Updated last month
- ☆50 Updated 6 months ago
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆57 Updated last year
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆52 Updated last month
- ☆45 Updated 9 months ago
- Transformers with doubly stochastic attention ☆40 Updated 2 years ago
- ☆31 Updated 10 months ago
- ☆16 Updated 3 months ago
- ☆54 Updated 2 years ago
- Pytorch implementation of preconditioned stochastic gradient descent (affine group preconditioner, low-rank approximation preconditioner … ☆128 Updated last month
- The official repository for our paper "The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers". We s… ☆66 Updated last year
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆36 Updated last year
- ☆36 Updated 10 months ago
- Standalone Product Key Memory module in Pytorch - for augmenting Transformer models ☆73 Updated 3 months ago
- Griffin MQA + Hawk Linear RNN Hybrid ☆85 Updated 6 months ago
- STABILIZING GRADIENTS FOR DEEP NEURAL NETWORKS VIA EFFICIENT SVD PARAMETERIZATION ☆16 Updated 6 years ago
- ☆77 Updated 3 months ago
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods. ☆29 Updated 3 weeks ago
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch ☆95 Updated last year
- RWKV model implementation ☆38 Updated last year
- ☆129 Updated last week
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆36 Updated 2 years ago
- ☆22 Updated last year
- A MAD laboratory to improve AI architecture designs 🧪 ☆95 Updated 6 months ago
- Sequence Modeling with Structured State Spaces ☆60 Updated 2 years ago
- NanoGPT-like codebase for LLM training ☆75 Updated this week
- The Energy Transformer block, in JAX ☆53 Updated 11 months ago
- ☆29 Updated this week
- [ICML 2024] SIRFShampoo: Structured inverse- and root-free Shampoo in PyTorch (https://arxiv.org/abs/2402.03496) ☆13 Updated 2 weeks ago
- Automatically take good care of your preemptible TPUs ☆32 Updated last year