MadryLab / modelcomponents
Decomposing and Editing Predictions by Modeling Model Computation
☆138 Updated 9 months ago
Alternatives and similar repositories for modelcomponents:
Users who are interested in modelcomponents are comparing it to the libraries listed below:
- ☆74 Updated 7 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆138 Updated 3 weeks ago
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆102 Updated 6 months ago
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆124 Updated 2 months ago
- Collection of Reverse Engineering in Large Model ☆32 Updated 2 months ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆76 Updated 3 weeks ago
- Interpretable text embeddings by asking LLMs yes/no questions (NeurIPS 2024) ☆37 Updated 4 months ago
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" ☆216 Updated 10 months ago
- A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE). ☆139 Updated 3 months ago
- A curated list of Model Merging methods. ☆91 Updated 6 months ago
- Official PyTorch Implementation for Task Vectors are Cross-Modal ☆22 Updated 3 months ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆73 Updated 4 months ago
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆75 Updated 6 months ago
- Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling ☆190 Updated 2 months ago
- Holistic evaluation of multimodal foundation models ☆45 Updated 7 months ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆77 Updated this week
- [COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? ☆72 Updated 2 months ago
- A More Fair and Comprehensive Comparison between KAN and MLP ☆164 Updated 7 months ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch ☆161 Updated 3 months ago
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models ☆209 Updated 3 weeks ago
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning ☆25 Updated last year
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction. ☆36 Updated this week
- Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind ☆122 Updated 7 months ago
- Reading list for research topics in state-space models ☆269 Updated 2 months ago
- A State-Space Model with Rational Transfer Function Representation. ☆78 Updated 10 months ago
- ☆261 Updated last month
- One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation ☆38 Updated 5 months ago
- Implementation of Infini-Transformer in Pytorch ☆110 Updated 2 months ago
- Code accompanying the paper "Massive Activations in Large Language Models" ☆150 Updated last year
- When it comes to optimizers, it's always better to be safe than sorry ☆214 Updated last week