LongAttn: Selecting Long-context Training Data via Token-level Attention
☆15Jul 16, 2025Updated 7 months ago
Alternatives and similar repositories for LongAttn
Users who are interested in LongAttn are comparing it to the libraries listed below
- Research work aimed at addressing the problem of modeling infinite-length context☆47Dec 18, 2025Updated 2 months ago
- LCA-on-the-line (ICML 2024 Oral)☆13Feb 13, 2025Updated last year
- [ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models☆60Jul 23, 2024Updated last year
- ☆21Jul 3, 2025Updated 8 months ago
- A comprehensive and efficient long-context model evaluation framework☆31Feb 25, 2026Updated last week
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision☆18Apr 1, 2025Updated 11 months ago
- ☆22Oct 22, 2024Updated last year
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆41Oct 11, 2024Updated last year
- Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression."☆18Dec 13, 2024Updated last year
- ☆47Nov 25, 2024Updated last year
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024☆22Jun 26, 2024Updated last year
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"☆110Oct 11, 2025Updated 4 months ago
- [ICLR 2025] The official PyTorch implementation of "Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Cont…☆72Sep 18, 2025Updated 5 months ago
- ☆62Oct 29, 2024Updated last year
- ☆14Jun 19, 2024Updated last year
- ☆37Oct 16, 2025Updated 4 months ago
- ☆109Jul 15, 2025Updated 7 months ago
- Muon fsdp 2☆55Aug 8, 2025Updated 7 months ago
- Source code of the paper "KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing"☆31Oct 24, 2024Updated last year
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025)☆33Sep 28, 2025Updated 5 months ago
- ☆64Apr 9, 2024Updated last year
- ☆38Nov 13, 2025Updated 3 months ago
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor☆30Apr 8, 2024Updated last year
- Llama-3-SynE: A Significantly Enhanced Version of Llama-3 with Advanced Scientific Reasoning and Chinese Language Capabilities | Continued pre-training to improve …☆37May 31, 2025Updated 9 months ago
- (ACL 2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation☆34May 28, 2025Updated 9 months ago
- Repository of IPBench☆19Jan 4, 2026Updated 2 months ago
- [ACL 2025] An official PyTorch implementation of the paper: Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement☆39May 28, 2025Updated 9 months ago
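- Re-identification of campus bicycles in surveillance-quality footage, including ReID model recognition, vector database retrieval, and a UI display☆10Feb 13, 2024Updated 2 years ago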
- ☆11Aug 20, 2025Updated 6 months ago
- ☆88Dec 29, 2023Updated 2 years ago
- Anchored Diffusion Language Model (NeurIPS 2025)☆27Oct 13, 2025Updated 4 months ago
- ☆14Aug 28, 2024Updated last year
- Official PyTorch implementation of DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs (ICML 2025 Oral)☆58Jun 27, 2025Updated 8 months ago
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models☆48Mar 7, 2024Updated 2 years ago
- 2018/2019/campus recruitment/spring recruitment/autumn recruitment/algorithms/Machine Learning/Deep Learning/Natural Language Processing (NLP)/C/C++/Python/interview notes☆10Oct 31, 2018Updated 7 years ago
- ☆26Oct 16, 2025Updated 4 months ago
- 30 must-read deep learning papers recommended by Ilya Sutskever (bilingual Chinese-English translated edition)☆13Dec 18, 2024Updated last year
- Human Evaluation Benchmark for Text Simplification☆10Sep 6, 2018Updated 7 years ago
- Artifacts of VLDB'22 paper "COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression"☆10Aug 2, 2022Updated 3 years ago