Glaciohound / LM-Infinite
Implementation of NAACL 2024 Outstanding Paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
☆115 Updated 2 weeks ago
Related projects:
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆133 Updated 3 months ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting ☆60 Updated 6 months ago
- ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models ☆148 Updated 6 months ago
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search ☆91 Updated 3 months ago
- open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality ☆135 Updated last month
- Codes for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718 ☆244 Updated last week
- Official github repo for the paper "Compression Represents Intelligence Linearly" ☆121 Updated 3 months ago
- Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024) ☆106 Updated this week
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆72 Updated 4 months ago
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" ☆87 Updated 2 months ago
- ☆164 Updated 4 months ago
- LongAlign: A Recipe for Long Context Alignment Encompassing Data, Training, and Evaluation ☆194 Updated 4 months ago
- Repository of LV-Eval Benchmark ☆41 Updated 2 weeks ago
- [ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models ☆72 Updated 6 months ago
- [ICML'24] The official implementation of "Rethinking Optimization and Architecture for Tiny Language Models" ☆114 Updated 2 months ago
- REST: Retrieval-Based Speculative Decoding, NAACL 2024 ☆158 Updated 4 months ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024) ☆195 Updated 3 months ago
- Easy control for Key-Value Constrained Generative LLM Inference (https://arxiv.org/abs/2402.06262) ☆56 Updated 7 months ago
- Official implementation for the paper *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆57 Updated 3 weeks ago
- [ICML 2024] Selecting High-Quality Data for Training Language Models ☆134 Updated 2 months ago
- Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models ☆208 Updated last week
- ☆99 Updated last year
- Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆123 Updated 2 months ago
- ☆75 Updated this week
- ☆104 Updated last month
- Explorations into some recent techniques surrounding speculative decoding ☆190 Updated 11 months ago
- Implementation of Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting ☆39 Updated 2 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆101 Updated last week
- ☆87 Updated 4 months ago
- ☆82 Updated 5 months ago