Lyun0912-wu / LongAttn
LongAttn: Selecting Long-context Training Data via Token-level Attention
☆15 Jul 16, 2025 Updated 7 months ago
Alternatives and similar repositories for LongAttn
Users who are interested in LongAttn are comparing it to the repositories listed below.
- Research work aimed at addressing the problem of modeling infinite-length context ☆46 Dec 18, 2025 Updated last month
- LCA-on-the-line (ICML 2024 Oral) ☆13 Feb 13, 2025 Updated last year
- [ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models ☆59 Jul 23, 2024 Updated last year
- ☆21 Jul 3, 2025 Updated 7 months ago
- A comprehensive and efficient long-context model evaluation framework ☆31 Feb 8, 2026 Updated last week
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision ☆18 Apr 1, 2025 Updated 10 months ago
- The official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation" ☆41 Oct 11, 2024 Updated last year
- ☆22 Oct 22, 2024 Updated last year
- Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression." ☆18 Dec 13, 2024 Updated last year
- ☆49 Nov 25, 2024 Updated last year
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024 ☆22 Jun 26, 2024 Updated last year
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆109 Oct 11, 2025 Updated 4 months ago
- ☆62 Oct 29, 2024 Updated last year
- [ICLR 2025] The official PyTorch implementation of "Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Cont… ☆70 Sep 18, 2025 Updated 4 months ago
- ☆36 Oct 16, 2025 Updated 4 months ago
- ☆109 Jul 15, 2025 Updated 7 months ago
- Muon fsdp 2 ☆53 Aug 8, 2025 Updated 6 months ago
- Source code of paper "KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing" ☆31 Oct 24, 2024 Updated last year
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025) ☆33 Sep 28, 2025 Updated 4 months ago
- ☆64 Apr 9, 2024 Updated last year
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor ☆31 Apr 8, 2024 Updated last year
- ☆38 Nov 13, 2025 Updated 3 months ago
- Llama-3-SynE: A Significantly Enhanced Version of Llama-3 with Advanced Scientific Reasoning and Chinese Language Capabilities | Continual pre-training to improve … ☆37 May 31, 2025 Updated 8 months ago
- (ACL 2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation ☆34 May 28, 2025 Updated 8 months ago
- Repository of IPBench ☆19 Jan 4, 2026 Updated last month
- [ACL 2025] An official PyTorch implementation of the paper: Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement ☆39 May 28, 2025 Updated 8 months ago
- ☆87 Dec 29, 2023 Updated 2 years ago
- Anchored Diffusion Language Model (NeurIPS 2025) ☆27 Oct 13, 2025 Updated 4 months ago
- ☆13 Aug 28, 2024 Updated last year
- Re-identification of campus bicycles in surveillance-quality footage, including ReID model recognition, vector database retrieval, and a UI display ☆10 Feb 13, 2024 Updated 2 years ago
- ☆11 Aug 20, 2025 Updated 5 months ago
- Official PyTorch implementation of DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs (ICML 2025 Oral) ☆57 Jun 27, 2025 Updated 7 months ago
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models ☆48 Mar 7, 2024 Updated last year
- Rationale-enhanced language models are better continual relation learners (EMNLP 2023 Main Conference) ☆12 Oct 11, 2023 Updated 2 years ago
- ☆16 Apr 30, 2025 Updated 9 months ago
- Using machine learning techniques for prediction and modelling of nonlinear dynamic systems. ☆10 Jun 29, 2018 Updated 7 years ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆46 Aug 13, 2025 Updated 6 months ago
- ☆10 Dec 10, 2024 Updated last year
- Repository to go along with the paper "Plumber: Diagnosing and Removing Performance Bottlenecks in Machine Learning Data Pipelines" ☆10 Mar 31, 2022 Updated 3 years ago