GATECH-EIC / ACT

[ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration

☆40

Alternatives and similar repositories for ACT

Users who are interested in ACT are comparing it to the libraries listed below.

horseee / CoT-Valve
CoT-Valve: Length-Compressible Chain-of-Thought Tuning
☆81 Updated 5 months ago
hkust-nlp / mstar
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆63 Updated 3 weeks ago
Joshua-Ren / Learning_dynamics_LLM
☆155 Updated 2 months ago
sail-sg / Attention-Sink
[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)
☆103 Updated 3 weeks ago
luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆42 Updated 9 months ago
hkust-nlp / Laser
Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping
☆52 Updated 2 months ago
Dereck0602 / Awesome_Test_Time_LLMs
☆117 Updated 4 months ago
MingyuJ666 / Rope_with_LLM
[ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen…
☆75 Updated last month
harveyhuang18 / EMR_Merging
[NeurIPS 2024 Spotlight] EMR-Merging: Tuning-Free High-Performance Model Merging
☆62 Updated 5 months ago
shiqichen17 / VLM_Merging
GitHub repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025)
☆68 Updated 2 months ago
SUSTechBruce / LOOK-M
[EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…
☆98 Updated 8 months ago
ShadeCloak / ADORA
☆46 Updated 3 months ago
YiyangZhou / CSR
[NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models
☆77 Updated last year
MileBench / MileBench
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
☆36 Updated last year
luka-group / mDPO
[EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.
☆79 Updated 8 months ago
shawnricecake / Heima
Code for Heima
☆51 Updated 3 months ago
LINs-lab / DynMoE
[ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
☆121 Updated 3 weeks ago
OpenSparseLLMs / LLaMA-MoE-v2
🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
☆86 Updated 8 months ago
SihengLi99 / LLM-Honesty-Survey
[2025-TMLR] A Survey on the Honesty of Large Language Models
☆58 Updated 7 months ago
NUS-TRAIL / NoisyRollout
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
☆83 Updated 2 months ago
bigai-nlco / LatentSeek
Official Repository of LatentSeek
☆56 Updated 2 months ago
ruixin31 / Spurious_Rewards
☆322 Updated last week
ssmisya / PRMBench
[ACL'25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models.
☆76 Updated 5 months ago
ChnQ / MI-Peaks
☆47 Updated 3 weeks ago
nickjiang2378 / vl-interp
Official PyTorch implementation of "Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations" (ICLR '25)
☆76 Updated 2 months ago
UCSB-NLP-Chang / ThinkPrune
☆39 Updated 3 months ago
sail-sg / ActivePRM
☆17 Updated 3 months ago
yihedeng9 / STIC
Enhancing Large Vision Language Models with Self-Training on Image Comprehension.
☆69 Updated last year
hemingkx / TokenSkip
TokenSkip: Controllable Chain-of-Thought Compression in LLMs
☆171 Updated last month
r-three / smear
☆30 Updated last year