PiotrNawrot / sparse-frontier
The evaluation framework for training-free sparse attention in LLMs
☆117 · Updated Jan 27, 2026
Alternatives and similar repositories for sparse-frontier
Users interested in sparse-frontier are comparing it to the repositories listed below.
- The simplest implementation of recent Sparse Attention patterns for efficient LLM inference. ☆92 · Updated Jul 17, 2025
- Research work aimed at addressing the problem of modeling infinite-length context ☆46 · Updated Dec 18, 2025
- [ICML 2025] XAttention: Block Sparse Attention with Antidiagonal Scoring ☆269 · Updated Jul 6, 2025
- ☆131 · Updated May 29, 2025
- CLaMR: Contextualized Late-Interaction for Multimodal Content Retrieval ☆23 · Updated Jun 28, 2025
- PyTorch implementation of the Flash Spectral Transform Unit. ☆21 · Updated Sep 19, 2024
- ClusterKV: Manipulating LLM KV Cache in Semantic Space for Recallable Compression (DAC'25) ☆23 · Updated Sep 15, 2025
- The official repo for the paper "Accelerating Parallel Sampling of Diffusion Models", Tang et al., ICML 2024, https://openreview.net… ☆16 · Updated Jul 19, 2024
- Code and data for the paper "(How) do Language Models Track State?" ☆21 · Updated Mar 31, 2025
- Official code repository for the paper "Key-value memory in the brain" ☆31 · Updated Feb 25, 2025
- KV cache compression for high-throughput LLM inference ☆153 · Updated Feb 5, 2025
- Fork of the Flame repo for training some new work in development ☆19 · Updated Jan 5, 2026
- Parallel Scaling Law for Language Models: Beyond Parameter and Inference-Time Scaling ☆469 · Updated May 17, 2025
- [ICLR 2025 & COLM 2025] Official PyTorch implementation of the Forgetting Transformer and Adaptive Computation Pruning ☆137 · Updated Dec 19, 2025
- The official implementation of Ada-KV [NeurIPS 2025] ☆126 · Updated Nov 26, 2025
- Code for the paper: [ICLR 2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference ☆160 · Updated Oct 13, 2025
- The code for creating the iGSM datasets in the papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Proces… ☆84 · Updated Jan 12, 2025
- Landing repository for the paper "Softpick: No Attention Sink, No Massive Activations with Rectified Softmax" ☆87 · Updated Sep 12, 2025
- Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence ☆56 · Updated Nov 11, 2025
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆28 · Updated Apr 17, 2024
- RWKV-X is a Linear Complexity Hybrid Language Model based on the RWKV architecture, integrating Sparse Attention to improve the model's l… ☆54 · Updated Jan 12, 2026
- [ICLR 2025] Official PyTorch implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆29 · Updated Jul 24, 2025
- [ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference ☆283 · Updated May 1, 2025
- [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference ☆56 · Updated Nov 20, 2024
- Official implementation of the paper "Pretraining Language Models to Ponder in Continuous Space" ☆24 · Updated Jul 21, 2025
- Code for the ICLR 2025 paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆109 · Updated Oct 11, 2025
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions ☆52 · Updated Jul 3, 2024
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024) ☆24 · Updated Jun 6, 2024
- LongAttn: Selecting Long-context Training Data via Token-level Attention ☆15 · Updated Jul 16, 2025
- Unofficial implementation of the paper "Exploring the Space of Key-Value-Query Models with Intention" ☆12 · Updated May 24, 2023
- Public code release for the paper "Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training" ☆11 · Updated Oct 27, 2025
- A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based … ☆11 · Updated Mar 18, 2023
- MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer (EMNLP 2025) ☆11 · Updated Apr 18, 2025
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆371 · Updated Dec 12, 2024
- Experiments on the impact of depth in transformers and SSMs. ☆40 · Updated Oct 23, 2025
- 🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention" ☆964 · Updated Feb 5, 2026
- [ICLR 2025] DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads ☆524 · Updated Feb 10, 2025
- [ACL 2025 Main] Repository for the paper "500xCompressor: Generalized Prompt Compression for Large Language Models" ☆56 · Updated Jun 11, 2025
- Scalable long-context LLM decoding that leverages sparsity by treating the KV cache as a vector storage system. ☆122 · Updated Jan 1, 2026
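Many of the repositories above are variants of training-free sparse attention: at inference time, each query attends only to a small subset of keys chosen by a cheap scoring rule, with no retraining. A minimal NumPy sketch of the generic top-k flavor of this idea (the function name and shapes are illustrative, not any listed repository's API):

```python
import numpy as np

def topk_sparse_attention(q, K, V, k=4):
    """Single-query sparse attention keeping only the top-k scoring keys.

    Hedged sketch of the generic training-free idea: scores are computed
    densely, then all but the k largest are masked to -inf before the
    softmax, so the output mixes only k of the n value vectors.
    """
    d = q.shape[-1]
    scores = K @ q / np.sqrt(d)          # (n,) scaled dot-product scores
    keep = np.argsort(scores)[-k:]       # indices of the k highest scores
    masked = np.full_like(scores, -np.inf)
    masked[keep] = scores[keep]          # drop all other keys
    w = np.exp(masked - scores[keep].max())  # numerically stable softmax
    w /= w.sum()
    return w @ V                         # weighted sum over kept values

rng = np.random.default_rng(0)
q = rng.standard_normal(8)
K = rng.standard_normal((32, 8))
V = rng.standard_normal((32, 8))
out = topk_sparse_attention(q, K, V, k=4)
print(out.shape)  # (8,)
```

Real systems (block-sparse, antidiagonal, or KV-cache-retrieval methods above) differ mainly in how `keep` is chosen cheaply without scoring every key.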