MadryLab / modelcomponents
Decomposing and Editing Predictions by Modeling Model Computation
☆131Updated 7 months ago
Alternatives and similar repositories for modelcomponents:
Users who are interested in modelcomponents are comparing it to the libraries listed below.
- A curated list of Model Merging methods.☆89Updated 4 months ago
- Official implementation of Phi-Mamba, a MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models)☆92Updated 4 months ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent☆70Updated 5 months ago
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models"☆209Updated 7 months ago
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.☆97Updated 2 weeks ago
- Official PyTorch Implementation of "Task Vectors are Cross-Modal"☆21Updated last month
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning☆22Updated last year
- Code accompanying the paper "Massive Activations in Large Language Models"☆133Updated 10 months ago
- ☆159Updated 11 months ago
- ☆69Updated 5 months ago
- A brief and partial summary of RLHF algorithms.☆89Updated last month
- ☆93Updated 6 months ago
- Reading list for research topics in state-space models☆253Updated 3 weeks ago
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models☆188Updated 2 weeks ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024)☆182Updated 7 months ago
- OpenReview Submission Visualization (ICLR 2024/2025)☆148Updated 3 months ago
- ☆82Updated 11 months ago
- ☆22Updated last year
- Graph Diffusion Policy Optimization☆29Updated 10 months ago
- Official source code for "Graph Neural Networks for Learning Equivariant Representations of Neural Networks". In ICLR 2024 (oral).☆76Updated 5 months ago
- A More Fair and Comprehensive Comparison between KAN and MLP☆155Updated 5 months ago
- Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning☆41Updated this week
- ☆21Updated last week
- Using sparse coding to find distributed representations used by neural networks.☆207Updated last year
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation☆36Updated 3 months ago
- A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).☆136Updated 2 weeks ago
- Code for reproducing our paper "Not All Language Model Features Are Linear"☆66Updated last month
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024]☆58Updated 3 months ago
- Optimal Transport in the Big Data Era☆98Updated 2 months ago
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling☆180Updated 2 months ago