princeton-pli / PruLongLinks

Code for the preprint "Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?"

☆46

Alternatives and similar repositories for PruLong

Users that are interested in PruLong are comparing it to the libraries listed below

Sorting:

Zanette-Labs / SpeculativeRejection
[NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection
☆52Updated 11 months ago
Leooyii / LCEG
Long Context Extension and Generalization in LLMs
☆62Updated last year
Shwai-He / MEO
The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)":
☆38Updated last year
PKU-ML / LongPPL
Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"
☆102Updated last week
sail-sg / SimLayerKV
The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction.
☆49Updated last year
sail-sg / LongSpec
LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification
☆65Updated 3 months ago
qiuzh20 / gated_attention
The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink…
☆95Updated last month
OpenSparseLLMs / Linear-MoE
☆119Updated 4 months ago
princeton-nlp / ProLong
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
☆231Updated last month
Kwai-Klear / RLEP
RL with Experience Replay
☆47Updated 2 months ago
Infini-AI-Lab / Kinetics
Kinetics: Rethinking Test-Time Scaling Laws
☆81Updated 3 months ago
princeton-pli / LongProc
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
☆30Updated last week
HKUNLP / STRING
[ICLR'25] Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?"
☆78Updated 10 months ago
thu-wyz / inference_scaling
☆74Updated 11 months ago
hkust-nlp / dart-math
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
☆115Updated 10 months ago
RZFan525 / Awesome-ScalingLaws
A curated list of awesome resources dedicated to Scaling Laws for LLMs
☆79Updated 2 years ago
HKUNLP / critic-rl
[ICML 2025] Teaching Language Models to Critique via Reinforcement Learning
☆114Updated 5 months ago
hkust-nlp / llm-compression-intelligence
Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]
☆142Updated last year
thunlp / Ouroboros
Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)
☆110Updated 7 months ago
zjunlp / LightThinker
[EMNLP 2025] LightThinker: Thinking Step-by-Step Compression
☆112Updated 6 months ago
namespace-Pt / UltraGist
☆18Updated 10 months ago
sail-sg / scaling-with-vocab
[NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623
☆88Updated last year
yaof20 / DenseMixer
Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient
☆58Updated 2 months ago
princeton-nlp / HELMET
The HELMET Benchmark
☆178Updated 2 months ago
princeton-nlp / CEPE
[ACL 2024] Long-Context Language Modeling with Parallel Encodings
☆164Updated last year
nightdessert / Retrieval_Head
open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality
☆215Updated last year
Infini-AI-Lab / gsm_infinite
☆55Updated 4 months ago
kamanphoebe / Look-into-MoEs
[NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models
☆55Updated 8 months ago
yyDing1 / ScaleQuest
[ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.
☆68Updated 11 months ago
CodeCreator / WebOrganizer
Organize the Web: Constructing Domains Enhances Pre-Training Data Curation
☆67Updated 5 months ago