rlite-project / RLite
A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithms with minimal intrusion.
☆33 Updated last month
Alternatives and similar repositories for RLite
Users who are interested in RLite are comparing it to the libraries listed below.
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning ☆127 Updated this week
- ☆71 Updated last week
- ☆50 Updated 3 weeks ago
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification ☆57 Updated 4 months ago
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆187 Updated 3 months ago
- ☆110 Updated last month
- ☆59 Updated last month
- Async pipelined version of Verl ☆106 Updated 3 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆135 Updated last year
- Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI, derived from Ling. ☆87 Updated 3 weeks ago
- [NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623 ☆86 Updated 9 months ago
- Accelerate LLM preference tuning via prefix sharing with a single line of code ☆42 Updated last week
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (… ☆136 Updated this week
- qwen-nsa ☆68 Updated 3 months ago
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ☆28 Updated 5 months ago
- Odysseus: Playground of LLM Sequence Parallelism ☆70 Updated last year
- Reproducing R1 for Code with Reliable Rewards ☆237 Updated 2 months ago
- Efficient Mixture of Experts for LLM Paper List ☆79 Updated 7 months ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆48 Updated 8 months ago
- Efficient Agent Training for Computer Use ☆114 Updated last month
- Resources for the Enigmata Project. ☆53 Updated last month
- ☆47 Updated last month
- ☆52 Updated 5 months ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25] ☆88 Updated 3 months ago
- ☆63 Updated 3 weeks ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆107 Updated 3 months ago
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts ☆40 Updated last year
- Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient ☆35 Updated this week
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆38 Updated 4 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆103 Updated 2 months ago