thunlp / APB
Official Implementation of APB (ACL 2025 main Oral) and Spava.
☆32 · Updated last week
Alternatives and similar repositories for APB
Users interested in APB are comparing it to the libraries listed below.
- ☆63 · Updated 7 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆42 · Updated last month
- ☆19 · Updated last year
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆81 · Updated last month
- ☆100 · Updated 6 months ago
- ☆85 · Updated 2 months ago
- ☆71 · Updated last year
- The official implementation of the paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction ☆52 · Updated last year
- Codebase for Instruction Following without Instruction Tuning ☆36 · Updated last year
- The open-source materials for the paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity" ☆30 · Updated last year
- The official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models" ☆56 · Updated last year
- [NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models ☆60 · Updated last year
- Research work aimed at addressing the problem of modeling infinite-length context ☆45 · Updated last month
- [EMNLP 2023] Context Compression for Auto-regressive Transformers with Sentinel Tokens ☆25 · Updated 2 years ago
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation ☆121 · Updated 8 months ago
- (ACL 2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation ☆34 · Updated 8 months ago
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts ☆40 · Updated last year
- Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI, derived from Ling ☆106 · Updated 6 months ago
- ☆82 · Updated 10 months ago
- ☆110 · Updated 4 months ago
- DPO, but faster 🚀 ☆47 · Updated last year
- Implementation of the paper "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆58 · Updated last week
- ☆21 · Updated 9 months ago
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models ☆17 · Updated 3 months ago
- ☆129 · Updated 8 months ago
- [ICML 2025] From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories and Applications ☆52 · Updated 3 months ago
- Official implementation of the ICML 2024 paper RoSA (Robust Adaptation) ☆44 · Updated last year
- ☆78 · Updated 7 months ago
- Beyond KV Caching: Shared Attention for Efficient LLMs ☆20 · Updated last year
- The code for the paper: Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search ☆63 · Updated 7 months ago