Scaling Preference Data Curation via Human-AI Synergy
☆142 · Jul 3, 2025 · Updated 8 months ago
Alternatives and similar repositories for Skywork-Reward-V2
Users who are interested in Skywork-Reward-V2 are comparing it to the repositories listed below.
- The official repo for “Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem” [EMNLP25] ☆34 · Sep 1, 2025 · Updated 6 months ago
- Supporting code for ReCEval paper ☆31 · Sep 14, 2024 · Updated last year
- Official implementation of the paper "Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-based Embedding Rou… ☆34 · Sep 25, 2025 · Updated 5 months ago
- [CVPR 2026] An official implementation of "Think Visually, Reason Textually: Vision-Language Synergy in ARC" ☆37 · Nov 26, 2025 · Updated 3 months ago
- The official implementation of "ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering" ☆56 · Jun 21, 2025 · Updated 8 months ago
- [NeurIPS 2025 D&B Track] Evaluation Code Repo for Paper "PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts" ☆41 · May 22, 2025 · Updated 9 months ago
- ☆17 · Aug 5, 2025 · Updated 6 months ago
- ☆54 · May 6, 2025 · Updated 9 months ago
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation ☆30 · Dec 22, 2025 · Updated 2 months ago
- Official implementation for our paper: Rethinking Video Tokenization: A Conditioned Diffusion-based Approach ☆14 · Apr 2, 2025 · Updated 11 months ago
- "Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" 2023 ☆16 · Nov 28, 2024 · Updated last year
- Code for ACL 2025 Main paper "Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning… ☆47 · Aug 4, 2025 · Updated 7 months ago
- ☆19 · Dec 20, 2025 · Updated 2 months ago
- ☆28 · Sep 4, 2025 · Updated 6 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆183 · Jul 23, 2025 · Updated 7 months ago
- (NeurIPS 2025 🔥) Official implementation for "Efficient Multi-modal Large Language Models via Progressive Consistency Distillation" ☆41 · Feb 11, 2026 · Updated 3 weeks ago
- Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs? ☆19 · Mar 9, 2025 · Updated 11 months ago
- ☆32 · Aug 26, 2025 · Updated 6 months ago
- All-in-one benchmarking platform for evaluating LLM. ☆15 · Nov 12, 2025 · Updated 3 months ago
- implementation code for 'PLATE: A Prompt-Enhanced Paradigm for Multi-Scenario Recommendations' in SIGIR 2023 ☆13 · Sep 27, 2024 · Updated last year
- 🌟Official code of our AAAI26 paper 🔍WebFilter ☆37 · Nov 9, 2025 · Updated 3 months ago
- Implementation of Negative-aware Finetuning (NFT) algorithm for "Bridging Supervised Learning and Reinforcement Learning in Math Reasonin… ☆71 · Sep 8, 2025 · Updated 5 months ago
- [arXiv'25] AnyCharV: Bootstrap Controllable Character Video Generation with Fine-to-Coarse Guidance ☆41 · Feb 19, 2025 · Updated last year
- ☆96 · Feb 4, 2026 · Updated last month
- Physics-based rigging with MPM for realistic character animation. ICCV 2025. ☆82 · Feb 21, 2026 · Updated last week
- Code for paper 'Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning' ☆18 · Apr 19, 2024 · Updated last year
- ☆66 · Jan 12, 2026 · Updated last month
- Manages vllm-nccl dependency ☆17 · Jun 3, 2024 · Updated last year
- Chinese-native image generation while compatible with SD eco-system, 1st-gen, AAAI2025 ☆13 · Jun 25, 2024 · Updated last year
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆159 · Jun 26, 2025 · Updated 8 months ago
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training ☆53 · Dec 13, 2025 · Updated 2 months ago
- BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models ☆40 · Oct 30, 2025 · Updated 4 months ago
- Official implementation of CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding. ☆48 · Sep 15, 2025 · Updated 5 months ago
- Defeating the Training-Inference Mismatch via FP16 ☆183 · Nov 14, 2025 · Updated 3 months ago
- ☆27 · Jul 23, 2025 · Updated 7 months ago
- Train transformer language models with reinforcement learning. ☆19 · Feb 25, 2025 · Updated last year
- ☆28 · Aug 13, 2025 · Updated 6 months ago
- VL-LN Bench: Towards Long-horizon Goal-oriented Navigation with Active Dialogs ☆48 · Jan 5, 2026 · Updated last month
- ☆33 · Jul 15, 2025 · Updated 7 months ago