zhiyuanhubj/LongRecipe

LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models

☆79

Alternatives and similar repositories for LongRecipe

Users interested in LongRecipe are comparing it to the libraries listed below.

princeton-nlp / ProLong
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
☆261 · Sep 12, 2025 · Updated 10 months ago
zhiyuanhubj / UoT
[NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models
☆107 · Aug 5, 2024 · Updated last year
Lux0926 / ASPRM
AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence
☆10 · Mar 2, 2025 · Updated last year
zhiyuanhubj / Meta-Ability-Alignment
Official code of paper "Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models"
☆88 · May 27, 2025 · Updated last year
sheryc / resonance_rope
[ACL 24 Findings] Implementation of Resonance RoPE and the PosGen synthetic dataset.
☆24 · Mar 5, 2024 · Updated 2 years ago
Leooyii / LCEG
[COLM'25] A Controlled Study on Long Context Extension and Generalization in LLMs
☆65 · Mar 9, 2026 · Updated 4 months ago
F2-Song / ICDPO
The official implementation of "ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization…
☆16 · Feb 15, 2024 · Updated 2 years ago
zhiyuanhubj / Long_form_VideoQA
[EMNLP’24 Main] Encoding and Controlling Global Semantics for Long-form Video Question Answering
☆18 · Oct 9, 2024 · Updated last year
dwzhu-pku / PoSE
Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024)
☆208 · May 20, 2024 · Updated 2 years ago
chuanyang-Zheng / DAPE
This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"
☆41 · Oct 11, 2024 · Updated last year
duyngtr16061999 / KDMCSE
☆10 · Apr 7, 2024 · Updated 2 years ago
FranxYao / Long-Context-Data-Engineering
Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context"
☆503 · Mar 19, 2024 · Updated 2 years ago
princeton-nlp / HELMET
The HELMET Benchmark
☆221 · Apr 17, 2026 · Updated 3 months ago
microsoft / LongRoPE
LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens.
☆290 · Oct 28, 2025 · Updated 9 months ago
nightdessert / Retrieval_Head
Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality"
☆241 · Aug 2, 2024 · Updated last year
October2001 / ProLong
[ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models
☆61 · Jul 23, 2024 · Updated 2 years ago
HKUNLP / ChunkLlama
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
☆451 · Oct 16, 2024 · Updated last year
FranxYao / Retrieval-Head-with-Flash-Attention
Efficient retrieval head analysis with Triton flash attention that supports topK probability
☆13 · Jun 15, 2024 · Updated 2 years ago
bigai-nlco / LooGLE
ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models
☆199 · Oct 8, 2024 · Updated last year
whyNLP / LCKV
Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance…
☆157 · Apr 7, 2025 · Updated last year
TIGER-AI-Lab / LongICLBench
Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025]
☆113 · Feb 20, 2025 · Updated last year
yf-he / UniGraph
UniGraph: Learning a Unified Cross-Domain Foundation Model for Text-Attributed Graphs (KDD'25)
☆31 · Jun 6, 2025 · Updated last year
PKU-ML / LongPPL
Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"
☆116 · Oct 11, 2025 · Updated 9 months ago
megagonlabs / holobench
🫧 Code for Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data (Maekawa*, Iso* et al.…
☆12 · Feb 25, 2025 · Updated last year
richardodliu / OpenCodeEval
☆52 · Mar 9, 2026 · Updated 4 months ago
DAMO-NLP-SG / CLEX
[ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models
☆78 · Mar 12, 2024 · Updated 2 years ago
allenai / easy-to-hard-generalization
Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"
☆48 · Jan 17, 2024 · Updated 2 years ago
dangxingyu / rnn-icrag
Official repository of the paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"
☆27 · Apr 17, 2024 · Updated 2 years ago
Lyun0912-wu / LongAttn
LongAttn: Selecting Long-context Training Data via Token-level Attention
☆15 · Jul 16, 2025 · Updated last year
princeton-pli / LongProc
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
☆36 · Feb 26, 2026 · Updated 5 months ago
tml-epfl / icl-alignment
Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]
☆33 · Jan 23, 2025 · Updated last year
kiaia / GIRAFFE
Extending context length of visual language models
☆12 · Dec 18, 2024 · Updated last year
Tiiiger / templm
Code release for "TempLM: Distilling Language Models into Template-Based Generators"
☆14 · Jul 21, 2022 · Updated 4 years ago
leloykun / flash-attention-minimal
Flash Attention in 300-500 lines of CUDA/C++
☆39 · Aug 22, 2025 · Updated 11 months ago
jzhang38 / LongMamba
Some preliminary explorations of Mamba's context scaling.
☆221 · Feb 8, 2024 · Updated 2 years ago
chenllliang / MMEvalPro
[NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs
☆25 · Sep 26, 2024 · Updated last year
LCM-Lab / L-CITEEVAL
Evaluating the faithfulness of long-context language models
☆30 · Oct 21, 2024 · Updated last year
haonan3 / AnchorContext
AnchorAttention: Improved attention for LLMs long-context training
☆216 · Jan 15, 2025 · Updated last year
NVIDIA / RULER
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
☆1,589 · Jul 22, 2026 · Updated last week