ZJU-REAL / EasySteer
A Unified Framework for High-Performance and Extensible LLM Steering
☆42 Updated this week
Alternatives and similar repositories for EasySteer
Users who are interested in EasySteer are comparing it to the libraries listed below.
- ☆29 Updated last month
- ☆36 Updated 2 weeks ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆86 Updated 7 months ago
- ☆25 Updated 3 weeks ago
- ☆37 Updated last month
- [NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model ☆53 Updated 2 months ago
- [arxiv: 2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents ☆44 Updated 3 months ago
- Mind the Gap: Bridging Thought Leap for Improved CoT Tuning https://arxiv.org/abs/2505.14684 ☆40 Updated last week
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025. ☆26 Updated 7 months ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation". ☆22 Updated last month
- ☆24 Updated 2 weeks ago
- [ICML'25] Official code of paper "Fast Large Language Model Collaborative Decoding via Speculation" ☆28 Updated 3 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning". ☆81 Updated 4 months ago
- Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning" ☆29 Updated 2 months ago
- ACL'2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs. and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of… ☆48 Updated 4 months ago
- The official repository of NeurIPS'25 paper "Ada-R1: From Long-Cot to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization" ☆19 Updated 2 weeks ago
- The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions". ☆15 Updated last month
- ☆21 Updated 5 months ago
- [ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen… ☆80 Updated 3 months ago
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning ☆30 Updated 3 weeks ago
- ☆67 Updated 3 months ago
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ☆90 Updated 7 months ago
- ☆154 Updated 4 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆71 Updated 4 months ago
- TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25 ☆68 Updated 3 months ago
- ☆26 Updated 4 months ago
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training ☆37 Updated last month
- SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks ☆93 Updated 2 weeks ago
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression ☆104 Updated 5 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning ☆62 Updated 4 months ago