OpenDataArena / OpenDataArena-Tool
☆70Updated last month
Alternatives and similar repositories for OpenDataArena-Tool
Users that are interested in OpenDataArena-Tool are comparing it to the libraries listed below
- a-m-team's exploration in large language modeling☆189Updated 4 months ago
- ☆358Updated 4 months ago
- Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning.☆158Updated 3 weeks ago
- ☆169Updated 5 months ago
- A live reading list for LLM data synthesis (Updated to July, 2025).☆383Updated last month
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning☆151Updated 9 months ago
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning☆270Updated this week
- Fantastic Data Engineering for Large Language Models☆90Updated 9 months ago
- Related work and background techniques for OpenAI o1☆222Updated 9 months ago
- Custom reward development on top of verl☆118Updated 4 months ago
- ☆297Updated 4 months ago
- ☆126Updated 3 weeks ago
- Collect every awesome work about r1!☆419Updated 5 months ago
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations☆133Updated 6 months ago
- A Comprehensive Survey on Long Context Language Modeling☆192Updated 3 months ago
- ☆413Updated last week
- ☆160Updated 8 months ago
- Extrapolating RLVR to General Domains without Verifiers☆173Updated 2 months ago
- An Awesome List of Agentic Model trained with Reinforcement Learning☆502Updated this week
- SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis☆106Updated 4 months ago
- [ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale☆263Updated 3 months ago
- A comprehensive collection of process reward models.☆111Updated 2 weeks ago
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (…☆366Updated this week
- A curated list of awesome works in Routing LLMs paradigm (👉 Welcome to submit your contributions to this code repository)☆65Updated 3 months ago
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆258Updated 7 months ago
- R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning☆640Updated 2 months ago
- A series of technical reports on Slow Thinking with LLM☆739Updated 2 months ago
- Scaling Deep Research via Reinforcement Learning in Real-world Environments.☆625Updated 6 months ago
- ☆548Updated 9 months ago
- ☆275Updated 3 months ago