HKUDS / SepLLM
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
☆69 · Updated 3 months ago
Alternatives and similar repositories for SepLLM:
Users interested in SepLLM are comparing it to the repositories listed below:
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆61 · Updated 6 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆62 · Updated 2 months ago
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆63 · Updated last month
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆120 · Updated last month
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling ☆101 · Updated 2 months ago
- LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models ☆74 · Updated 6 months ago
- [ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models ☆86 · Updated 10 months ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large … ☆75 · Updated 3 months ago
- Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint? ☆95 · Updated 5 months ago
- Code for Paper: Teaching Language Models to Critique via Reinforcement Learning ☆90 · Updated last week
- AnchorAttention: Improved attention for LLM long-context training ☆206 · Updated 3 months ago
- [NeurIPS 2024] The official implementation of the paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs ☆114 · Updated 3 weeks ago
- [NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models ☆41 · Updated last month
- The code of RouterDC ☆57 · Updated this week
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models ☆40 · Updated 4 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆68 · Updated 3 weeks ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆180 · Updated last month
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models ☆155 · Updated last month
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆64 · Updated 6 months ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆31 · Updated last month
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆32 · Updated last year