rlite-project / RLite
A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithms with minimal intrusion.
☆32 · Updated last month
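The description above is the project's own pitch. Purely as an illustration of what a "minimally intrusive" RL integration tends to look like — this is a hypothetical sketch, not RLite's actual API; `TinyPolicy`, `reward_fn`, and the training loop are invented for this example — a plain REINFORCE loop wrapped around an ordinary PyTorch model might be:

```python
# Hypothetical sketch of a low-intrusion RL loop. The names here are
# invented for illustration and do not reflect RLite's real API.
import torch
import torch.nn as nn

class TinyPolicy(nn.Module):
    """A two-action policy over a 4-dim observation; stands in for 'your model'."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4, 16), nn.Tanh(), nn.Linear(16, 2))

    def forward(self, obs):
        return torch.distributions.Categorical(logits=self.net(obs))

def reward_fn(actions):
    """Toy rule-based reward: action 1 is always better (stands in for a task reward)."""
    return (actions == 1).float()

policy = TinyPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for step in range(200):
    obs = torch.randn(32, 4)          # batch of observations
    dist = policy(obs)
    actions = dist.sample()
    rewards = reward_fn(actions)
    # Plain REINFORCE with a mean-reward baseline:
    # maximize E[log pi(a|s) * (r - baseline)]
    loss = -(dist.log_prob(actions) * (rewards - rewards.mean())).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The point of the "minimal intrusion" claim, presumably, is that the model and reward stay ordinary user code and the framework only supplies the loop around them.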
Alternatives and similar repositories for RLite
Users interested in RLite are comparing it to the repositories listed below.
- Reproducing R1 for Code with Reliable Rewards ☆221 · Updated last month
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification ☆54 · Updated 3 months ago
- Async pipelined version of Verl ☆100 · Updated 2 months ago
- Revisiting Mid-training in the Era of RL Scaling ☆62 · Updated 2 months ago
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ☆28 · Updated 4 months ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆45 · Updated 7 months ago
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆138 · Updated 9 months ago
- Resources for the Enigmata Project ☆40 · Updated 2 weeks ago
- Odysseus: Playground of LLM Sequence Parallelism ☆70 · Updated last year
- [NeurIPS 2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies (https://arxiv.org/abs/2407.13623) ☆85 · Updated 9 months ago
- The code and data for the paper JiuZhang3.0 ☆47 · Updated last year
- Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset (see the GRPO sketch after this list) ☆17 · Updated 2 months ago
- ARM: Adaptive Reasoning Model ☆40 · Updated last week
- A comprehensive collection on learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model… ☆47 · Updated last week
- The code for creating the iGSM datasets in papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Proces… ☆55 · Updated 5 months ago
- The official implementation of Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free ☆44 · Updated last month
- The official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models" ☆51 · Updated 11 months ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆104 · Updated 2 months ago
- "what, how, where, and how well? a survey on test-time scaling in large language models" repository☆45Updated this week
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (… ☆127 · Updated this week
- General Reasoner: Advancing LLM Reasoning Across All Domains ☆141 · Updated 2 weeks ago
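Several entries above, such as the R1-Zero item, rely on GRPO (Group Relative Policy Optimization). As a minimal, non-authoritative sketch of the core idea — the group-relative advantage normalization described in the DeepSeekMath paper, not the actual code of any listed repo — each prompt's sampled completions are scored and then normalized against their own group's statistics:

```python
# Illustrative GRPO advantage computation, assuming the standard
# DeepSeekMath formulation; not taken from any repository listed above.
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Group-relative advantages.

    `rewards` has shape (num_prompts, group_size): for each prompt, sample
    group_size completions and score each with a scalar (e.g. rule-based) reward.
    Each completion's advantage is its reward normalized by the group's
    mean and standard deviation.
    """
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)

# Example: 2 prompts, 4 sampled completions each, 0/1 rule-based rewards.
rewards = torch.tensor([[1., 0., 0., 1.],
                        [0., 0., 1., 0.]])
print(grpo_advantages(rewards))
```

Because the advantage is computed within each group, this style of training needs no learned value model, which is part of why rule-based rewards pair well with it.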