[ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concentrated in low-frequency dimensions across different attention heads exclusively in attention queries (Q) and keys (K) while absent in values (V).
☆86Jun 20, 2025Updated 11 months ago
Alternatives and similar repositories for Rope_with_LLM
Users that are interested in Rope_with_LLM are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [ICONIP'24]Mingyu.Jin's final year project☆30Aug 23, 2024Updated last year
- [EACL'26] DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router☆105Jan 4, 2026Updated 4 months ago
- [ICLR 2025] Official implementation of paper "Dynamic Low-Rank Sparse Adaptation for Large Language Models".☆25Mar 16, 2025Updated last year
- [COLM'24] We propose Protein Chain of Thought (ProCoT), which replicates the biological mechanism of signaling pathways as language promp…☆73Nov 23, 2025Updated 6 months ago
- VAEGAN, I Love u☆16Aug 15, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- ☆23May 17, 2025Updated last year
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models"☆23Mar 4, 2025Updated last year
- Confidence Regulation Neurons in Language Models (NeurIPS 2024)☆15Feb 1, 2025Updated last year
- [ACL 2025] The official code for "AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection".☆39Aug 4, 2025Updated 9 months ago
- [ACL'24] Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models (LLMs). However, the correla…☆47May 11, 2025Updated last year
- [ICML 2025] Fourier Position Embedding: Enhancing Attention’s Periodic Extension for Length Generalization☆114Jun 2, 2025Updated 11 months ago
- [FCS'24] LVLM Safety paper☆19Jan 4, 2025Updated last year
- The code for "AttentionPredictor: Temporal Pattern Matters for Efficient LLM Inference", Qingyue Yang, Jie Wang, Xing Li, Zhihai Wang, Ch…☆28Jul 15, 2025Updated 10 months ago
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity".☆30Nov 12, 2024Updated last year
- End-to-end encrypted cloud storage - Proton Drive • AdSpecial offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
- [EMNLP 2024 Findings🔥] Official implementation of ": LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…☆103Nov 9, 2024Updated last year
- ☆23May 8, 2025Updated last year
- [CVPR 2026] Boosting Reasoning in Large Multimodal Models via Activation Replay☆22May 7, 2026Updated 3 weeks ago
- [NeurIPS 2024] Mitigating Object Hallucination via Concentric Causal Attention☆66Aug 30, 2025Updated 9 months ago
- Github Repo for OATS: Outlier-Aware Pruning through Sparse and Low Rank Decomposition☆20Apr 16, 2025Updated last year
- Source code for the ACL'2025 paper titled "Unveiling privacy risks in llm agent memory"☆30Dec 2, 2025Updated 5 months ago
- ☆27May 12, 2026Updated 2 weeks ago
- ☆138Aug 11, 2025Updated 9 months ago
- The code for paper "EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning"☆39Oct 1, 2025Updated 7 months ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- ☆22Jun 5, 2025Updated 11 months ago
- [ICLR 2026] Geometric-Mean Policy Optimization☆103Jan 26, 2026Updated 4 months ago
- ☆21Feb 10, 2025Updated last year
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati…☆45Jun 30, 2024Updated last year
- ☆18Apr 7, 2025Updated last year
- DICE: Detecting In-distribution Data Contamination with LLM's Internal State☆11Sep 21, 2024Updated last year
- ☆12Sep 8, 2023Updated 2 years ago
- ☆11Oct 2, 2024Updated last year
- A curated list of materials on AI guardrails☆53Jun 3, 2025Updated 11 months ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- SparseGPT + GPTQ Compression of LLMs like LLaMa, OPT, Pythia☆42Mar 13, 2023Updated 3 years ago
- [ICLR 2025 Oral] Official Implementation for "Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference Un…☆21Oct 24, 2024Updated last year
- Official PyTorch implementation of the paper "Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Princ…☆42Jul 18, 2025Updated 10 months ago
- ☆22Oct 10, 2025Updated 7 months ago
- [NDSS 2026] Official repo for Odysseus: Jailbreaking Commercial Multimodal LLM-integrated Systems via Dual Steganography☆36Mar 14, 2026Updated 2 months ago
- ✨✨ Official repo for "Comparative Analysis of Demonstration Selection Algorithms for LLM In-Context Learning"☆16Nov 8, 2024Updated last year
- [CVPR 2025] Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention☆67Jul 16, 2024Updated last year