neilwen987 / CSR_Adaptive_RepLinks
Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
☆110Updated 2 weeks ago
Alternatives and similar repositories for CSR_Adaptive_Rep
Users that are interested in CSR_Adaptive_Rep are comparing it to the libraries listed below
Sorting:
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging"☆29Updated 8 months ago
- [Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla…☆60Updated 9 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains☆46Updated 2 months ago
- ☆30Updated 3 months ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623☆86Updated 9 months ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)☆33Updated 4 months ago
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.☆144Updated this week
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"☆74Updated 9 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."☆42Updated 8 months ago
- AnchorAttention: Improved attention for LLMs long-context training☆208Updated 6 months ago
- ☆43Updated 5 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025☆27Updated 2 months ago
- Code for Heima☆49Updated 2 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)☆99Updated last week
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆64Updated last month
- [NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions"☆38Updated last year
- Official implementation of Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs (ICLR 2024).☆42Updated 11 months ago
- Model Merging with SVD to Tie the KnOTS [ICLR 2025]☆59Updated 3 months ago
- Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"☆99Updated last week
- Exploration of automated dataset selection approaches at large scales.☆47Updated 4 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models☆49Updated last month
- Repo for "Z1: Efficient Test-time Scaling with Code"☆63Updated 3 months ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)☆27Updated 4 months ago
- ☆18Updated 6 months ago
- ☆82Updated 10 months ago
- ☆51Updated last week
- Code for Let LLMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604☆43Updated last month
- Github repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025)☆63Updated last month
- Official implementation of the ICML 2024 paper RoSA (Robust Adaptation)☆42Updated last year
- ☆113Updated 4 months ago