SkyworkAI / Skywork-Reward-V2
Scaling Preference Data Curation via Human-AI Synergy
☆114 · Updated 3 months ago
Alternatives and similar repositories for Skywork-Reward-V2
Users who are interested in Skywork-Reward-V2 are comparing it to the repositories listed below.
- Fantastic Data Engineering for Large Language Models ☆90 · Updated 9 months ago
- ☆91 · Updated 4 months ago
- [ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs. ☆68 · Updated 11 months ago
- ☆169 · Updated 5 months ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆134 · Updated 3 months ago
- ☆73 · Updated 8 months ago
- SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis ☆104 · Updated 4 months ago
- ☆77 · Updated last month
- ☆106 · Updated 4 months ago
- ☆131 · Updated this week
- Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning. ☆157 · Updated last week
- [ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight) ☆171 · Updated 7 months ago
- ☆39 · Updated 2 months ago
- ☆96 · Updated last year
- ☆105 · Updated 2 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning ☆149 · Updated 9 months ago
- ☆59 · Updated 11 months ago
- A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect… ☆79 · Updated last week
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent ☆65 · Updated 4 months ago
- MiroRL is an MCP-first reinforcement learning framework for deep research agents. ☆162 · Updated last month
- The demo, code and data of FollowRAG ☆74 · Updated 3 months ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆130 · Updated 5 months ago
- ☆83 · Updated last year
- ☆49 · Updated 6 months ago
- ☆154 · Updated 4 months ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆324 · Updated last month
- ☆49 · Updated last year
- The official repo of "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents" ☆72 · Updated this week
- A Comprehensive Survey on Long Context Language Modeling ☆189 · Updated 2 months ago
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆262 · Updated 3 weeks ago