MadryLab / modelcomponents
Decomposing and Editing Predictions by Modeling Model Computation
☆138 Updated last year
Alternatives and similar repositories for modelcomponents
Users who are interested in modelcomponents compare it to the libraries listed below.
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆108 Updated 9 months ago
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆137 Updated 5 months ago
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" ☆223 Updated last year
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆82 Updated this week
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning ☆31 Updated last year
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models ☆221 Updated last month
- One-shot Entropy Minimization ☆149 Updated 2 weeks ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆75 Updated 6 months ago
- Interpretable text embeddings by asking LLMs yes/no questions (NeurIPS 2024) ☆37 Updated 7 months ago
- ☆183 Updated last year
- [COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? ☆79 Updated 5 months ago
- Collection of Reverse Engineering in Large Model ☆32 Updated 5 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆160 Updated 3 months ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆95 Updated 3 weeks ago
- A curated list of Model Merging methods. ☆92 Updated 9 months ago
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆37 Updated 7 months ago
- Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States ☆67 Updated 11 months ago
- A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE). ☆147 Updated 5 months ago
- ☆131 Updated 7 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆27 Updated last month
- ☆28 Updated last year
- ☆95 Updated 4 months ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch ☆175 Updated this week
- Code accompanying the paper "Massive Activations in Large Language Models" ☆163 Updated last year
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆46 Updated 3 weeks ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆190 Updated last year
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆185 Updated last week
- Using sparse coding to find distributed representations used by neural networks. ☆255 Updated last year
- ViT Prisma is a mechanistic interpretability library for Vision and Video Transformers (ViTs). ☆275 Updated 2 weeks ago
- [NeurIPS 2023 Spotlight] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training ☆35 Updated 2 months ago