rlite-project / RLite
A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithms with minimal intrusion.
☆32 · Updated last month
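The description above is the project's own pitch. Purely as an illustration of what a "minimally intrusive" RL integration tends to look like — this is a hypothetical sketch, not RLite's actual API; `TinyPolicy`, `reward_fn`, and the training loop are invented for this example — a plain REINFORCE loop wrapped around an ordinary PyTorch model might be:

```python
# Hypothetical sketch of a low-intrusion RL loop. The names here are
# invented for illustration and do not reflect RLite's real API.
import torch
import torch.nn as nn

class TinyPolicy(nn.Module):
    """A two-action policy over a 4-dim observation; stands in for 'your model'."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4, 16), nn.Tanh(), nn.Linear(16, 2))

    def forward(self, obs):
        return torch.distributions.Categorical(logits=self.net(obs))

def reward_fn(actions):
    """Toy rule-based reward: action 1 is always better (stands in for a task reward)."""
    return (actions == 1).float()

policy = TinyPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for step in range(200):
    obs = torch.randn(32, 4)          # batch of observations
    dist = policy(obs)
    actions = dist.sample()
    rewards = reward_fn(actions)
    # Plain REINFORCE with a mean-reward baseline:
    # maximize E[log pi(a|s) * (r - baseline)]
    loss = -(dist.log_prob(actions) * (rewards - rewards.mean())).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The point of the "minimal intrusion" claim, presumably, is that the model and reward stay ordinary user code and the framework only supplies the loop around them.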
Alternatives and similar repositories for RLite
Users interested in RLite are comparing it to the repositories listed below.
- Reproducing R1 for Code with Reliable Rewards ☆221 · Updated last month
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification ☆54 · Updated 3 months ago
- Async pipelined version of Verl ☆100 · Updated 2 months ago
- Revisiting Mid-training in the Era of RL Scaling ☆62 · Updated 2 months ago
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ☆28 · Updated 4 months ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆45 · Updated 7 months ago
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆138 · Updated 9 months ago
- Resources for the Enigmata Project ☆40 · Updated 2 weeks ago
- Odysseus: Playground of LLM Sequence Parallelism ☆70 · Updated last year
- [NeurIPS 2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies (https://arxiv.org/abs/2407.13623) ☆85 · Updated 9 months ago
- The code and data for the paper JiuZhang3.0 ☆47 · Updated last year
- Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset (see the GRPO sketch after this list) ☆17 · Updated 2 months ago
- ARM: Adaptive Reasoning Model ☆40 · Updated last week
- A comprehensive collection on learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model… ☆47 · Updated last week
- The code for creating the iGSM datasets in papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Proces… ☆55 · Updated 5 months ago
- The official implementation of Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free ☆44 · Updated last month
- The official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models" ☆51 · Updated 11 months ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆104 · Updated 2 months ago
- "what, how, where, and how well? a survey on test-time scaling in large language models" repository☆45Updated this week
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (… ☆127 · Updated this week
- General Reasoner: Advancing LLM Reasoning Across All Domains ☆141 · Updated 2 weeks ago
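Several entries above, such as the R1-Zero item, rely on GRPO (Group Relative Policy Optimization). As a minimal, non-authoritative sketch of the core idea — the group-relative advantage normalization described in the DeepSeekMath paper, not the actual code of any listed repo — each prompt's sampled completions are scored and then normalized against their own group's statistics:

```python
# Illustrative GRPO advantage computation, assuming the standard
# DeepSeekMath formulation; not taken from any repository listed above.
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Group-relative advantages.

    `rewards` has shape (num_prompts, group_size): for each prompt, sample
    group_size completions and score each with a scalar (e.g. rule-based) reward.
    Each completion's advantage is its reward normalized by the group's
    mean and standard deviation.
    """
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)

# Example: 2 prompts, 4 sampled completions each, 0/1 rule-based rewards.
rewards = torch.tensor([[1., 0., 0., 1.],
                        [0., 0., 1., 0.]])
print(grpo_advantages(rewards))
```

Because the advantage is computed within each group, this style of training needs no learned value model, which is part of why rule-based rewards pair well with it.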