SkyworkAI/Skywork-Reward-V2

SkyworkAI / Skywork-Reward-V2

Scaling Preference Data Curation via Human-AI Synergy

☆152

Alternatives and similar repositories for Skywork-Reward-V2

Users who are interested in Skywork-Reward-V2 are comparing it to the repositories listed below.

InternLM / POLAR
Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning.
☆166 · Sep 23, 2025 · Updated 10 months ago
facebookresearch / multimodal_rewardbench
Multimodal RewardBench
☆68 · Feb 21, 2025 · Updated last year
vl-rewardbench / VL_RewardBench
☆29 · Jul 23, 2025 · Updated last year
ruixin31 / Spurious_Rewards
☆361 · Jul 29, 2025 · Updated 11 months ago
BytedanceDouyinContent / SAIL-VL2
The SAIL-VL2 series model developed by the BytedanceDouyinContent Group
☆79 · Sep 18, 2025 · Updated 10 months ago
TIGER-AI-Lab / One-Shot-CFT
The official repo for “Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem” [EMNLP25]
☆33 · Sep 1, 2025 · Updated 10 months ago
Tencent-Hunyuan / Hunyuan-4B
☆16 · Aug 5, 2025 · Updated 11 months ago
agentscope-ai / Trinity-RFT
Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (…
☆673 · Jul 21, 2026 · Updated last week
HongbangYuan / OmniReward
☆47 · Dec 16, 2025 · Updated 7 months ago
LeapLabTHU / limit-of-RLVR
repo for paper https://arxiv.org/abs/2504.13837
☆346 · Dec 17, 2025 · Updated 7 months ago
CodeGoat24 / Pref-GRPO
Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
☆275 · Feb 10, 2026 · Updated 5 months ago
multimodal-art-projection / REER_DeepWriter
REverse-Engineered Reasoning for Open-Ended Generation
☆98 · Sep 10, 2025 · Updated 10 months ago
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆189 · Jul 23, 2025 · Updated last year
RamyaLab / pluralistic-alignment
The open-source repository for PAL: Sample-Efficient Personalized Reward Modeling for Pluralistic Alignment, which provides a general per…
☆17 · Aug 28, 2025 · Updated 11 months ago
SkyworkAI / Skywork-OR1
Unleashing the Power of Reinforcement Learning for Math and Code Reasoners
☆740 · Jun 6, 2025 · Updated last year
IANNXANG / RuscaRL
☆48 · Jan 30, 2026 · Updated 5 months ago
QwenLM / ParScale
Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling
☆480 · May 17, 2025 · Updated last year
sauc-abadal / ALT
Official repository for ALT (ALignment with Textual feedback).
☆10 · Jul 25, 2024 · Updated 2 years ago
CodeGoat24 / UnifiedReward
Official implementation of UnifiedReward & [NeurIPS 2025] UnifiedReward-Think & UnifiedReward-Flex
☆796 · Jun 18, 2026 · Updated last month
liziniu / cold_start_rl
Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs?
☆20 · Mar 9, 2025 · Updated last year
archiki / ReCEval
Supporting code for ReCEval paper
☆32 · Sep 14, 2024 · Updated last year
Qwen-Applications / DIR
☆17 · Feb 14, 2026 · Updated 5 months ago
THUDM / slime
slime is an LLM post-training framework for RL Scaling.
☆7,679 · Updated this week
facebookresearch / AdvancedIF
This is the github to open source benchmark AdvancedIF, see LAMA L1387358RCRO
☆37 · Nov 26, 2025 · Updated 8 months ago
allenai / olmix
☆41 · May 26, 2026 · Updated 2 months ago
HumanMLLM / HumanOmniV2
☆161 · Jul 31, 2025 · Updated 11 months ago
THU-KEG / R-Eval
[KDD24-ADS] R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models
☆11 · Apr 9, 2024 · Updated 2 years ago
sail-sg / Precision-RL
Defeating the Training-Inference Mismatch via FP16
☆197 · Nov 14, 2025 · Updated 8 months ago
alibaba / ROLL
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
☆3,333 · Updated this week
RM-R1-UIUC / RM-R1
[ICLR'26] RM-R1: Unleashing the Reasoning Potential of Reward Models
☆167 · Jun 26, 2025 · Updated last year
TIGER-AI-Lab / General-Reasoner
General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
☆229 · Nov 27, 2025 · Updated 8 months ago
OpenRLHF / OpenRLHF
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy…
☆9,855 · Jul 14, 2026 · Updated 2 weeks ago
lime-RL / DCPO
DCPO: Dynamic Adaptive Clipping for RL
☆49 · Apr 1, 2026 · Updated 3 months ago
yongliang-wu / DFT
[ICLR 2026] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.
☆588 · Jan 4, 2026 · Updated 6 months ago
hiyouga / EasyR1
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
☆5,083 · Updated this week
TIGER-AI-Lab / verl-tool
A version of verl to support diverse tool use [TMLR 2026]
☆1,026 · Jul 15, 2026 · Updated last week
IAAR-Shanghai / xVerify
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
☆149 · Nov 13, 2025 · Updated 8 months ago
Ayanami0730 / deep_research_bench
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
☆803 · May 11, 2026 · Updated 2 months ago
zwhong714 / PSFT
[ICLR 2026] PSFT is a trust-region–inspired fine-tuning objective that views SFT as a policy gradient method with constant advantages, co…
☆39 · Sep 9, 2025 · Updated 10 months ago