modelscope / RM-Gallery
A One-Stop Reward Model Platform
☆101Updated this week
Alternatives and similar repositories for RM-Gallery
Users who are interested in RM-Gallery are comparing it to the libraries listed below:
- a-m-team's exploration in large language modeling☆194Updated 6 months ago
- Related work and background techniques for OpenAI o1☆221Updated 11 months ago
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (…☆432Updated this week
- ☆47Updated 10 months ago
- ☆392Updated last month
- A live reading list for LLM data synthesis (Updated to July, 2025).☆420Updated 3 months ago
- ☆147Updated last month
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo…☆409Updated 5 months ago
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning☆284Updated 2 years ago
- Custom reward development on verl☆132Updated 6 months ago
- ☆146Updated last year
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning☆293Updated last month
- ☆277Updated 6 months ago
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆272Updated 9 months ago
- Awesome papers for role-playing with language models☆213Updated last year
- ☆162Updated 10 months ago
- ☆77Updated 10 months ago
- ☆87Updated last year
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".☆138Updated last year
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation☆90Updated last year
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning☆154Updated 11 months ago
- ☆77Updated 2 weeks ago
- ☆317Updated last year
- ☆147Updated last year
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆184Updated 5 months ago
- ☆64Updated 7 months ago
- A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.☆200Updated 7 months ago
- Reinforcement Learning in LLM and NLP.☆61Updated 3 months ago
- SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis☆111Updated 6 months ago
- ☆181Updated 2 years ago