MingyuJ666 / Rope_with_LLM
[ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. We first observe that massive values are concentrated in low-frequency dimensions and appear exclusively in the attention queries (Q) and keys (K) across different attention heads, while they are absent from the values (V).
☆85 · Jun 20, 2025 · Updated 7 months ago
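As a rough illustration of the kind of measurement the description refers to (not code from this repository, and not the paper's per-head, RoPE-frequency-band analysis), the sketch below captures the fused QKV projection of one GPT-2 attention layer with a forward hook and compares the largest per-dimension activation magnitudes in Q, K, and V. The model name, layer index, prompt, and printed statistics are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumptions: "gpt2" as a small stand-in model; layer 0; a single short prompt.
model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

captured = {}

def save_output(name):
    # Forward hook that stashes the module's output tensor.
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

# GPT-2 fuses the Q, K, V projections into a single c_attn layer,
# so we capture its output once and split it into Q, K, V afterwards.
model.transformer.h[0].attn.c_attn.register_forward_hook(save_output("qkv"))

with torch.no_grad():
    ids = tok("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
    model(**ids)

q, k, v = captured["qkv"].split(model.config.n_embd, dim=-1)
for name, t in [("Q", q), ("K", k), ("V", v)]:
    per_dim_max = t.abs().amax(dim=(0, 1))  # largest |activation| per hidden dimension
    top = per_dim_max.max().item()
    med = per_dim_max.median().item()
    # A large top/median ratio suggests a few dimensions carry "massive" values.
    print(f"{name}: max={top:.2f}  median={med:.2f}  ratio={top / med:.1f}")
```

Under these assumptions, a markedly higher max-to-median ratio for Q and K than for V would be consistent with the concentration the description reports, though the paper's claim concerns low-frequency RoPE dimensions specifically.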
Alternatives and similar repositories for Rope_with_LLM
Users interested in Rope_with_LLM are comparing it to the repositories listed below
- [ICLR 2025] Official implementation of paper "Dynamic Low-Rank Sparse Adaptation for Large Language Models". ☆23 · Mar 16, 2025 · Updated 10 months ago
- Confidence Regulation Neurons in Language Models (NeurIPS 2024) ☆15 · Feb 1, 2025 · Updated last year
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models" ☆24 · Mar 4, 2025 · Updated 11 months ago
- [NeurIPS 2025] Official repository for “FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models” ☆28 · Dec 9, 2025 · Updated 2 months ago
- The code for "AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference", Qingyue Yang, Jie Wang, Xing Li, Zhihai Wang, Ch… ☆27 · Jul 15, 2025 · Updated 7 months ago
- This repo contains the code for the paper "Understanding and Mitigating Hallucinations in Large Vision-Language Models via Modular Attrib… ☆33 · Jul 14, 2025 · Updated 7 months ago
- ☆19 · Dec 3, 2025 · Updated 2 months ago
- ☆16 · Apr 7, 2025 · Updated 10 months ago
- GitHub repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition ☆17 · Apr 16, 2025 · Updated 9 months ago
- [ICLR 2026] Geometric-Mean Policy Optimization ☆100 · Jan 26, 2026 · Updated 2 weeks ago
- ☆99 · Aug 11, 2025 · Updated 6 months ago
- ☆23 · May 8, 2025 · Updated 9 months ago
- The code for paper "EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning" ☆37 · Oct 1, 2025 · Updated 4 months ago
- ☆19 · Apr 3, 2025 · Updated 10 months ago
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In… ☆104 · Nov 9, 2024 · Updated last year
- [ICML2025] KVTuner: Sensitivity-Aware Layer-wise Mixed Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference ☆26 · Jan 27, 2026 · Updated 2 weeks ago
- ☆16 · Jul 23, 2024 · Updated last year
- ☆27 · Nov 25, 2025 · Updated 2 months ago
- This is the repository for paper EscapeBench: Pushing Language Models to Think Outside the Box ☆18 · Dec 19, 2024 · Updated last year
- [CVPR 2025] VASparse: Towards Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification ☆48 · Mar 24, 2025 · Updated 10 months ago
- ☆20 · Feb 10, 2025 · Updated last year
- AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference ☆20 · Jan 24, 2025 · Updated last year
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia ☆42 · Mar 13, 2023 · Updated 2 years ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆46 · Jun 30, 2024 · Updated last year
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆30 · Nov 12, 2024 · Updated last year
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ☆97 · Feb 21, 2025 · Updated 11 months ago
- (ICLR25 Oral) Do as We Do, Not as You Think: the Conformity of Large Language Models ☆38 · Feb 6, 2026 · Updated last week
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention ☆61 · Jul 16, 2024 · Updated last year
- OpenBA-V2: 3B LLM (Large Language Model) with T5 architecture, utilizing model pruning technique and continuing pretraining from OpenBA-1… ☆25 · May 10, 2024 · Updated last year
- [ICML2024 Spotlight] Fine-Tuning Pre-trained Large Language Models Sparsely ☆24 · Jun 26, 2024 · Updated last year
- [ICLR 2025] See What You Are Told: Visual Attention Sink in Large Multimodal Models ☆90 · Feb 16, 2025 · Updated 11 months ago
- Official PyTorch implementation of the paper "Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Princ… ☆40 · Jul 18, 2025 · Updated 6 months ago
- Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference… ☆31 · Nov 14, 2023 · Updated 2 years ago
- [ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization" ☆211 · Nov 25, 2025 · Updated 2 months ago
- Exploring the Limitations of Large Language Models on Multi-Hop Queries ☆32 · Mar 2, 2025 · Updated 11 months ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆72 · Jan 16, 2026 · Updated 3 weeks ago
- A list of papers about data quality in Large Language Models (LLMs) ☆27 · Dec 14, 2023 · Updated 2 years ago
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆33 · Aug 14, 2024 · Updated last year
- AbstainQA, ACL 2024 ☆28 · Feb 4, 2026 · Updated last week