modelscope / RM-Gallery
A One-Stop Reward Model Platform
☆90Updated this week
Alternatives and similar repositories for RM-Gallery
Users who are interested in RM-Gallery are comparing it to the libraries listed below:
- a-m-team's exploration in large language modeling☆192Updated 5 months ago
- ☆142Updated 3 weeks ago
- Related work and background techniques for OpenAI o1☆221Updated 10 months ago
- A live reading list for LLM data synthesis (Updated to July, 2025).☆408Updated 2 months ago
- ☆146Updated last year
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo…☆400Updated 4 months ago
- ☆86Updated last year
- ☆382Updated last month
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning☆282Updated 2 years ago
- SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis☆110Updated 5 months ago
- ☆47Updated 9 months ago
- ☆76Updated last week
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆265Updated 9 months ago
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (…☆404Updated last week
- Custom reward development on top of verl☆128Updated 6 months ago
- ☆147Updated last year
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning☆153Updated 10 months ago
- A research repo for experiments about Reinforcement Finetuning☆52Updated 7 months ago
- ☆77Updated 9 months ago
- Fantastic Data Engineering for Large Language Models☆92Updated 10 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆182Updated 4 months ago
- Awesome papers for role-playing with language models☆210Updated last year
- ☆129Updated 6 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".☆136Updated last year
- ☆274Updated 5 months ago
- Code and data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models☆266Updated last year
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations☆138Updated last week
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"☆135Updated last year
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)☆97Updated 9 months ago
- ☆162Updated 10 months ago