MadryLab / modelcomponents
Decomposing and Editing Predictions by Modeling Model Computation
☆138 Updated 10 months ago
Alternatives and similar repositories for modelcomponents:
Users who are interested in modelcomponents are comparing it to the libraries listed below.
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆105 Updated 7 months ago
- Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆128 Updated 3 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆145 Updated last month
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning ☆28 Updated last year
- A curated list of Model Merging methods. ☆91 Updated 7 months ago
- Optimal Transport in the Big Data Era ☆106 Updated 5 months ago
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆76 Updated 7 months ago
- ☆175 Updated last year
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆45 Updated 5 months ago
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models ☆214 Updated last week
- ☆78 Updated 8 months ago
- A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE). ☆143 Updated 3 months ago
- Official PyTorch Implementation for Task Vectors are Cross-Modal ☆22 Updated 4 months ago
- Awesome list of papers that extend Mamba to various applications. ☆132 Updated 3 weeks ago
- Interpretable text embeddings by asking LLMs yes/no questions (NeurIPS 2024) ☆37 Updated 5 months ago
- ☆52 Updated this week
- ☆25 Updated last year
- Code accompanying the paper "Massive Activations in Large Language Models" ☆156 Updated last year
- [EMNLP 2023 Main] Sparse Low-rank Adaptation of Pre-trained Language Models ☆75 Updated last year
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆67 Updated 2 months ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆92 Updated last month
- ☆104 Updated 5 months ago
- 👋 Code for: "CRAFT: Concept Recursive Activation FacTorization for Explainability" (CVPR 2023) ☆62 Updated last year
- ☆522 Updated 2 weeks ago
- [COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? ☆72 Updated 3 months ago
- A brief and partial summary of RLHF algorithms. ☆127 Updated last month
- [NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging ☆57 Updated last month
- ☆53 Updated 5 months ago
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization" ☆50 Updated 11 months ago
- Awesome Learn From Model Beyond Fine-Tuning: A Survey ☆62 Updated 4 months ago