neilwen987 / CSR_Adaptive_RepLinks

Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation

☆116

Alternatives and similar repositories for CSR_Adaptive_Rep

Users that are interested in CSR_Adaptive_Rep are comparing it to the libraries listed below

Sorting:

EvolvingLMMs-Lab / multimodal-sae
[ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.
☆146Updated 3 weeks ago
RobertCsordas / moeut
☆83Updated 11 months ago
g-luo / vlm_cross_modal_reps
Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025
☆29Updated 3 months ago
microsoft / x-reasoner
X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains
☆47Updated 3 months ago
luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆42Updated 9 months ago
waltonfuture / Diff-eRank
[NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models
☆51Updated 2 months ago
ShiZhengyan / InstructionModelling
[NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions"
☆38Updated last year
wang-kee / LiNeS
Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging"
☆30Updated 9 months ago
tianyi-lab / MoE-Embedding
Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"
☆76Updated 9 months ago
Dereck0602 / Awesome_Test_Time_LLMs
☆117Updated 4 months ago
haonan3 / AnchorContext
AnchorAttention: Improved attention for LLMs long-context training
☆212Updated 6 months ago
gstoica27 / KnOTS
Model Merging with SVD to Tie the KnOTS [ICLR 2025]
☆62Updated 4 months ago
ByungKwanLee / Phantom
[Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with …
☆60Updated 10 months ago
kyegomez / Mixture-of-Depths
Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
☆103Updated this week
Weixin-Liang / Mixture-of-Mamba
☆44Updated 6 months ago
NuoJohnChen / JudgeLRM
JudgeLRM: Large Reasoning Models as a Judge
☆32Updated 3 months ago
sail-sg / Attention-Sink
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
☆107Updated last month
fangyuan-ksgk / CoT-Reasoning-without-Prompting
Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting
☆32Updated last year
zzwjames / FailureLLMUnlearning
An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)
☆29Updated 5 months ago
shawnricecake / Heima
Code for Heima
☆51Updated 3 months ago
multimodal-art-projection / LatentCoT-Horizon
📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning.
☆171Updated last week
dmis-lab / Monet
[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers
☆70Updated last month
lucidrains / coconut-pytorch
Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch
☆178Updated last month
Wang-ML-Lab / multimodal-needle-in-a-haystack
[NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models
☆48Updated 3 months ago
VijayLingam95 / SVFT
☆30Updated 5 months ago
multimodal-interpretability / maia
Official implementation of MAIA, A Multimodal Automated Interpretability Agent
☆83Updated last month
chuanyang-Zheng / DAPE
The this is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"
☆38Updated 9 months ago
shiqichen17 / VLM_Merging
Github repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025)
☆68Updated 2 months ago
sail-sg / scaling-with-vocab
[NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623
☆86Updated 10 months ago
prateeky2806 / ties-merging
☆185Updated last year