Scaling Preference Data Curation via Human-AI Synergy
☆145Jul 3, 2025Updated 8 months ago
Alternatives and similar repositories for Skywork-Reward-V2
Users who are interested in Skywork-Reward-V2 are comparing it to the libraries listed below.
- Supporting code for ReCEval paper☆31Sep 14, 2024Updated last year
- The official repo for “Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem” [EMNLP25]☆34Sep 1, 2025Updated 6 months ago
- ☆27Jul 23, 2025Updated 8 months ago
- ☆29Sep 4, 2025Updated 6 months ago
- ☆17Aug 5, 2025Updated 7 months ago
- Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs?☆20Mar 9, 2025Updated last year
- [NeurIPS 2025 D&B Track] Evaluation Code Repo for Paper "PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts"☆43May 22, 2025Updated 10 months ago
- ☆21Jul 24, 2025Updated 8 months ago
- Multimodal RewardBench☆64Feb 21, 2025Updated last year
- [CVPR 2026] An official implementation of "Think Visually, Reason Textually: Vision-Language Synergy in ARC"☆39Nov 26, 2025Updated 4 months ago
- The official implementation of "ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering"☆58Jun 21, 2025Updated 9 months ago
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation☆30Dec 22, 2025Updated 3 months ago
- (CVPR 26 Findings) Official implementation of the paper "Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-…☆34Sep 25, 2025Updated 6 months ago
- All-in-one benchmarking platform for evaluating LLM.☆15Nov 12, 2025Updated 4 months ago
- The raw UserRL repo under construction☆97Sep 25, 2025Updated 6 months ago
- ☆13Mar 28, 2025Updated 11 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling☆186Jul 23, 2025Updated 8 months ago
- ☆54May 6, 2025Updated 10 months ago
- Official implementation for our paper: Rethinking Video Tokenization: A Conditioned Diffusion-based Approach☆14Apr 2, 2025Updated 11 months ago
- 1.4B sLLM for Chinese and English - HammerLLM🔨☆42Apr 7, 2024Updated last year
- Train transformer language models with reinforcement learning.☆19Feb 25, 2025Updated last year
- Unleashing the Power of Reinforcement Learning for Math and Code Reasoners☆743Jun 6, 2025Updated 9 months ago
- [KDD24-ADS] R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models☆11Apr 9, 2024Updated last year
- DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents☆642Mar 20, 2026Updated last week
- Official code of "StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs".☆74Jun 23, 2025Updated 9 months ago
- Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning.☆164Sep 23, 2025Updated 6 months ago
- The official repo of "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents"☆111Sep 29, 2025Updated 5 months ago
- ☆811Jun 9, 2025Updated 9 months ago
- RENT (Reinforcement Learning via Entropy Minimization) is an unsupervised method for training reasoning LLMs.☆43Oct 31, 2025Updated 4 months ago
- Official repository of paper "LOVE-R1: Advancing Long Video Understanding with Adaptive Zoom-in Mechanism via Multi-Step Reasoning"☆23Nov 1, 2025Updated 4 months ago
- Crawls comments on popular Xiaohongshu notes, then cleans, preprocesses, and labels (binary classification) the collected comments.☆13Jun 11, 2025Updated 9 months ago
- The code and dataset for "FastRE: Towards Fast Relation Extraction with Convolutional Encoder and Improved Cascade Binary Tagging Framewo…☆24Aug 13, 2022Updated 3 years ago
- Async pipelined version of Verl☆124Apr 8, 2025Updated 11 months ago
- A comprehensive Model Context Protocol (MCP) server for market sizing analysis, TAM/SAM calculations, and industry research. Built with T…☆29Jun 22, 2025Updated 9 months ago
- The source code and manually annotated datasets for our paper "Joint Multimodal Sentiment Analysis Based on Information Relevance"☆11Dec 17, 2022Updated 3 years ago
- ☆28Aug 13, 2025Updated 7 months ago
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style☆79Jul 18, 2025Updated 8 months ago
- [CVPR 2025] Parallel Sequence Modeling via Generalized Spatial Propagation Network☆111Jul 18, 2025Updated 8 months ago
- ☆13May 29, 2024Updated last year