HKUDS / SepLLM
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
☆71 · Updated 4 months ago
Alternatives and similar repositories for SepLLM:
Users who are interested in SepLLM are comparing it to the repositories listed below:
- Code for "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free" ☆68 · Updated 6 months ago
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆35 · Updated 3 months ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆35 · Updated 2 months ago
- ☆24 · Updated last month
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆136 · Updated last month
- Code for Paper: Teaching Language Models to Critique via Reinforcement Learning ☆94 · Updated 3 weeks ago
- ☆95 · Updated last month
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆63 · Updated 2 months ago
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆90 · Updated 2 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆89 · Updated 2 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆67 · Updated 2 months ago
- LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models ☆76 · Updated 6 months ago
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling ☆101 · Updated 3 months ago
- ☆80 · Updated 3 weeks ago
- Research code for the preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning" ☆94 · Updated last month
- ☆40 · Updated this week
- This repository introduces a comprehensive paper list, datasets, methods and tools for memory research. ☆26 · Updated last week
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆69 · Updated last month
- ☆96 · Updated this week
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆195 · Updated last month
- ZeroSearch: Incentivize the Search Capability of LLMs without Searching ☆321 · Updated this week
- [ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy" ☆82 · Updated 11 months ago
- ☆22 · Updated 9 months ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers ☆56 · Updated 2 months ago
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models ☆42 · Updated 5 months ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large … ☆75 · Updated 4 months ago
- ☆144 · Updated 8 months ago
- ☆78 · Updated 3 months ago
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning" ☆62 · Updated 2 weeks ago
- ☆132 · Updated 9 months ago