MingyuJ666 / Rope_with_LLM
[ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. We first observe that massive values are concentrated in low-frequency dimensions and appear exclusively in the attention queries (Q) and keys (K) across different attention heads, while they are absent from the values (V).
☆85 · Jun 20, 2025 · Updated 7 months ago
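As a rough illustration of the kind of measurement the description refers to (not code from this repository, and not the paper's per-head, RoPE-frequency-band analysis), the sketch below captures the fused QKV projection of one GPT-2 attention layer with a forward hook and compares the largest per-dimension activation magnitudes in Q, K, and V. The model name, layer index, prompt, and printed statistics are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumptions: "gpt2" as a small stand-in model; layer 0; a single short prompt.
model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

captured = {}

def save_output(name):
    # Forward hook that stashes the module's output tensor.
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

# GPT-2 fuses the Q, K, V projections into a single c_attn layer,
# so we capture its output once and split it into Q, K, V afterwards.
model.transformer.h[0].attn.c_attn.register_forward_hook(save_output("qkv"))

with torch.no_grad():
    ids = tok("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
    model(**ids)

q, k, v = captured["qkv"].split(model.config.n_embd, dim=-1)
for name, t in [("Q", q), ("K", k), ("V", v)]:
    per_dim_max = t.abs().amax(dim=(0, 1))  # largest |activation| per hidden dimension
    top = per_dim_max.max().item()
    med = per_dim_max.median().item()
    # A large top/median ratio suggests a few dimensions carry "massive" values.
    print(f"{name}: max={top:.2f}  median={med:.2f}  ratio={top / med:.1f}")
```

Under these assumptions, a markedly higher max-to-median ratio for Q and K than for V would be consistent with the concentration the description reports, though the paper's claim concerns low-frequency RoPE dimensions specifically.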
Alternatives and similar repositories for Rope_with_LLM
Users interested in Rope_with_LLM are comparing it to the repositories listed below
- [ICLR 2025] Official implementation of paper "Dynamic Low-Rank Sparse Adaptation for Large Language Models". ☆23 · Mar 16, 2025 · Updated 10 months ago
- Confidence Regulation Neurons in Language Models (NeurIPS 2024) ☆15 · Feb 1, 2025 · Updated last year
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models" ☆24 · Mar 4, 2025 · Updated 11 months ago
- [NeurIPS 2025] Official repository for “FlowCut: Rethinking Redundancy via Information Flow for Efficient Vision-Language Models” ☆28 · Dec 9, 2025 · Updated 2 months ago
- The code for "AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference", Qingyue Yang, Jie Wang, Xing Li, Zhihai Wang, Ch… ☆27 · Jul 15, 2025 · Updated 7 months ago
- This repo contains the code for the paper "Understanding and Mitigating Hallucinations in Large Vision-Language Models via Modular Attrib… ☆33 · Jul 14, 2025 · Updated 7 months ago
- ☆19 · Dec 3, 2025 · Updated 2 months ago
- ☆16 · Apr 7, 2025 · Updated 10 months ago
- GitHub repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition ☆17 · Apr 16, 2025 · Updated 9 months ago
- [ICLR 2026] Geometric-Mean Policy Optimization ☆100 · Jan 26, 2026 · Updated 2 weeks ago
- ☆99 · Aug 11, 2025 · Updated 6 months ago
- ☆23 · May 8, 2025 · Updated 9 months ago
- The code for paper "EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning" ☆37 · Oct 1, 2025 · Updated 4 months ago
- ☆19 · Apr 3, 2025 · Updated 10 months ago
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In… ☆104 · Nov 9, 2024 · Updated last year
- [ICML2025] KVTuner: Sensitivity-Aware Layer-wise Mixed Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference ☆26 · Jan 27, 2026 · Updated 2 weeks ago
- ☆16 · Jul 23, 2024 · Updated last year
- ☆27 · Nov 25, 2025 · Updated 2 months ago
- This is the repository for paper EscapeBench: Pushing Language Models to Think Outside the Box ☆18 · Dec 19, 2024 · Updated last year
- [CVPR 2025] VASparse: Towards Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification ☆48 · Mar 24, 2025 · Updated 10 months ago
- ☆20 · Feb 10, 2025 · Updated last year
- AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference ☆20 · Jan 24, 2025 · Updated last year
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia ☆42 · Mar 13, 2023 · Updated 2 years ago
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆46 · Jun 30, 2024 · Updated last year
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆30 · Nov 12, 2024 · Updated last year
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ☆97 · Feb 21, 2025 · Updated 11 months ago
- (ICLR25 Oral) Do as We Do, Not as You Think: the Conformity of Large Language Models ☆38 · Feb 6, 2026 · Updated last week
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention ☆61 · Jul 16, 2024 · Updated last year
- OpenBA-V2: 3B LLM (Large Language Model) with T5 architecture, utilizing model pruning technique and continuing pretraining from OpenBA-1… ☆25 · May 10, 2024 · Updated last year
- [ICML2024 Spotlight] Fine-Tuning Pre-trained Large Language Models Sparsely ☆24 · Jun 26, 2024 · Updated last year
- [ICLR 2025] See What You Are Told: Visual Attention Sink in Large Multimodal Models ☆90 · Feb 16, 2025 · Updated 11 months ago
- Official PyTorch implementation of the paper "Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Princ… ☆40 · Jul 18, 2025 · Updated 6 months ago
- Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference… ☆31 · Nov 14, 2023 · Updated 2 years ago
- [ICML 2025] Official PyTorch implementation of "FlatQuant: Flatness Matters for LLM Quantization" ☆211 · Nov 25, 2025 · Updated 2 months ago
- Exploring the Limitations of Large Language Models on Multi-Hop Queries ☆32 · Mar 2, 2025 · Updated 11 months ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆72 · Jan 16, 2026 · Updated 3 weeks ago
- A list of papers about data quality in Large Language Models (LLMs) ☆27 · Dec 14, 2023 · Updated 2 years ago
- My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated ☆33 · Aug 14, 2024 · Updated last year
- AbstainQA, ACL 2024 ☆28 · Feb 4, 2026 · Updated last week