Glaciohound / LM-Infinite
Implementation of NAACL 2024 Outstanding Paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
☆128 · Updated last month
Related projects
Alternatives and complementary repositories for LM-Infinite
- ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models ☆167 · Updated last month
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆76 · Updated last month
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆144 · Updated 5 months ago
- Codes for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718 ☆285 · Updated last month
- REST: Retrieval-Based Speculative Decoding, NAACL 2024 ☆176 · Updated last month
- Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality" ☆160 · Updated 3 months ago
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" ☆91 · Updated 4 months ago
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆127 · Updated 2 months ago
- [ICML'24] The official implementation of "Rethinking Optimization and Architecture for Tiny Language Models" ☆118 · Updated 4 months ago
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024) ☆138 · Updated 2 months ago
- Easy control for Key-Value Constrained Generative LLM Inference (https://arxiv.org/abs/2402.06262) ☆58 · Updated 9 months ago
- [EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs ☆217 · Updated 6 months ago
- Homepage for ProLong (Princeton long-context language models) and the paper "How to Train Long-Context Language Models (Effectively)" ☆118 · Updated 3 weeks ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Long Length (ICLR 2024) ☆199 · Updated 6 months ago
- Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?" ☆64 · Updated last week
- Code associated with the paper "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding" ☆138 · Updated 5 months ago
- Codes and Data for "Scaling Relationship on Learning Mathematical Reasoning with Large Language Models" ☆219 · Updated 2 months ago
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models ☆73 · Updated 8 months ago
- Unofficial implementation of the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆134 · Updated 5 months ago
- Repository of the LV-Eval benchmark ☆48 · Updated 2 months ago
- GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM ☆147 · Updated 4 months ago
- A prototype repo for hybrid training with pipeline parallelism and distributed data parallelism, with comments on core code snippets. Feel free to… ☆49 · Updated last year
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆64 · Updated 5 months ago
- Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding ☆78 · Updated this week
- [NeurIPS 2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆69 · Updated last month
- Official repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale" ☆191 · Updated last month
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (Official Code) ☆135 · Updated last month