lsdefine / lsrl
Low ReSource Reinforcement Learning with CPU Offloading Training Support
☆81 · Updated last month
Alternatives and similar repositories for lsrl
Users who are interested in lsrl are comparing it to the libraries listed below.
- [TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models · ☆731 · Updated 3 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ · ☆273 · Updated last year
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains… · ☆258 · Updated 5 months ago
- Related work and background techniques for OpenAI o1 · ☆220 · Updated last year
- A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond · ☆342 · Updated 2 weeks ago
- ☆333 · Updated 8 months ago
- ☆48 · Updated 11 months ago
- ☆305 · Updated 7 months ago
- [EMNLP 2025] TokenSkip: Controllable Chain-of-Thought Compression in LLMs · ☆200 · Updated 2 months ago
- A series of technical reports on Slow Thinking with LLM · ☆759 · Updated 5 months ago
- 🔥 How to efficiently and effectively compress the CoTs or directly generate concise CoTs during inference while maintaining the reasonin… · ☆64 · Updated 8 months ago
- ☆57 · Updated 8 months ago
- LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment · ☆395 · Updated last year
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! · ☆72 · Updated 10 months ago
- Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" · ☆390 · Updated last year
- A version of verl to support diverse tool use · ☆860 · Updated last month
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning" · ☆153 · Updated 3 months ago
- Official Repository of "Learning to Reason under Off-Policy Guidance" · ☆413 · Updated 4 months ago
- [ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning · ☆512 · Updated last year
- A live reading list for LLM data synthesis (Updated to July, 2025). · ☆449 · Updated 5 months ago
- ☆73 · Updated 9 months ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. · ☆274 · Updated last week
- a-m-team's exploration in large language modeling · ☆195 · Updated 8 months ago
- Paper list for Efficient Reasoning. · ☆813 · Updated last week
- llm & rl · ☆271 · Updated 3 months ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning · ☆260 · Updated 8 months ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning. · ☆419 · Updated 6 months ago
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024) · ☆691 · Updated last year
- ☆214 · Updated 11 months ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations · ☆143 · Updated 2 months ago