modelscope / RM-Gallery

A One-Stop Reward Model Platform

☆101

Alternatives and similar repositories for RM-Gallery

Users who are interested in RM-Gallery are comparing it to the libraries listed below.

a-m-team / a-m-models
a-m-team's exploration in large language modeling
☆194Updated 6 months ago
wjn1996 / Awesome-LLM-Reasoning-Openai-o1-Survey
Related work and background techniques for OpenAI o1
☆221Updated 11 months ago
modelscope / Trinity-RFT
Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (…
☆432Updated this week
Mryangkaitong / deepseek-r1-gsm8k
☆47Updated 10 months ago
qiancheng0 / ToolRL
☆392Updated last month
pengr / LLM-Synthetic-Data
A live reading list for LLM data synthesis (Updated to July, 2025).
☆420Updated 3 months ago
chenchen0103 / ACEBench
☆147Updated last month
tianyi-lab / Cherry_LLM
[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo…
☆409Updated 5 months ago
OFA-Sys / InsTag
InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning
☆284Updated 2 years ago
yuanzhoulvpi2017 / nano_rl
Custom reward development on top of verl
☆132Updated 6 months ago
CASIA-LM / MoDS
☆146Updated last year
RUC-NLPIR / Tool-Star
🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning
☆293Updated last month
morecry / CharacterEval
☆277Updated 6 months ago
HarderThenHarder / RLLoggingBoard
A visualization tool for deeper understanding and easier debugging of RLHF training.
☆272Updated 9 months ago
nuochenpku / Awesome-Role-Play-Papers
Awesome papers for role-playing with language models
☆213Updated last year
bytarnish / AGILE
☆162Updated 10 months ago
LivingFutureLab / ChineseSimpleQA
☆77Updated 10 months ago
pldlgb / nuggets
☆87Updated last year
sail-sg / sdft
[ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".
☆138Updated last year
IronBeliever / CaR
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation
☆90Updated last year
ADaM-BJTU / OpenRFT
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
☆154Updated 11 months ago
YunjiaXi / Awesome-Search-Agent-Papers
☆77Updated 2 weeks ago
QwenLM / AutoIF
☆317Updated last year
thu-coai / CritiqueLLM
☆147Updated last year
tianyi-lab / Superfiltering
[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
☆184Updated 5 months ago
kongjiellx / octupus-tool-call
☆64Updated 7 months ago
THUNLP-MT / StableToolBench
A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.
☆200Updated 7 months ago
hscspring / rl-llm-nlp
Reinforcement Learning in LLM and NLP.
☆61Updated 3 months ago
RUCAIBox / SimpleDeepSearcher
SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis
☆111Updated 6 months ago
CASIA-LM / ChineseWebText
☆181Updated 2 years ago