MadryLab / modelcomponents
Decomposing and Editing Predictions by Modeling Model Computation
☆124 · Updated 5 months ago
Related projects
Alternatives and complementary repositories for modelcomponents
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆78 · Updated 2 months ago
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" ☆200 · Updated 5 months ago
- Code accompanying the paper "Massive Activations in Large Language Models" ☆123 · Updated 8 months ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆62 · Updated 3 months ago
- Reading list for research topics in state-space models ☆241 · Updated 2 weeks ago
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers" ☆103 · Updated 3 months ago
- Optimal Transport in the Big Data Era ☆93 · Updated 2 weeks ago
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling ☆169 · Updated last week
- Implementation of Griffin from the paper: "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models" ☆49 · Updated last week
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆120 · Updated last year
- Official JAX implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States ☆366 · Updated 3 months ago
- Some preliminary explorations of Mamba's context scaling. ☆191 · Updated 9 months ago
- ViT Prisma is a mechanistic interpretability library for Vision Transformers (ViTs). ☆179 · Updated this week
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models ☆174 · Updated this week
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆61 · Updated last week
- Awesome list of papers that extend Mamba to various applications. ☆128 · Updated 2 months ago
- A curated list of Model Merging methods. ☆83 · Updated 2 months ago
- PyTorch Implementation of Jamba: "Jamba: A Hybrid Transformer-Mamba Language Model" ☆137 · Updated last week
- PyTorch implementation of Structured State Space for Sequence Modeling (S4), based on Annotated S4. ☆70 · Updated 8 months ago
- Griffin MQA + Hawk Linear RNN Hybrid ☆85 · Updated 6 months ago
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google DeepMind, in PyTorch ☆88 · Updated 10 months ago
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction. ☆31 · Updated 2 months ago
- A State-Space Model with Rational Transfer Function Representation. ☆70 · Updated 6 months ago
- A curated list for awesome discrete diffusion models resources. ☆67 · Updated last week
- A MAD laboratory to improve AI architecture designs 🧪 ☆95 · Updated 6 months ago
- Collection of papers on state-space models ☆556 · Updated 2 weeks ago