neilwen987 / CSR_Adaptive_Rep
Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
☆116 Updated last month
Alternatives and similar repositories for CSR_Adaptive_Rep
Users that are interested in CSR_Adaptive_Rep are comparing it to the libraries listed below
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆146 Updated 3 weeks ago
- ☆83 Updated 11 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆29 Updated 3 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆47 Updated 3 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆42 Updated 9 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆51 Updated 2 months ago
- [NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions" ☆38 Updated last year
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆30 Updated 9 months ago
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆76 Updated 9 months ago
- ☆117 Updated 4 months ago
- AnchorAttention: Improved attention for LLMs long-context training ☆212 Updated 6 months ago
- Model Merging with SVD to Tie the KnOTS [ICLR 2025] ☆62 Updated 4 months ago
- [Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with … ☆60 Updated 10 months ago
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆103 Updated this week
- ☆44 Updated 6 months ago
- JudgeLRM: Large Reasoning Models as a Judge ☆32 Updated 3 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆107 Updated last month
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆32 Updated last year
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) ☆29 Updated 5 months ago
- Code for Heima ☆51 Updated 3 months ago
- 📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning. ☆171 Updated last week
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers ☆70 Updated last month
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch ☆178 Updated last month
- [NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models ☆48 Updated 3 months ago
- ☆30 Updated 5 months ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆83 Updated last month
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆38 Updated 9 months ago
- GitHub repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025) ☆68 Updated 2 months ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆86 Updated 10 months ago
- ☆185 Updated last year