smiles724 / Awesome-LLM-RLVR
Collection of latest papers and materials in the area of RLVR!
☆28 · Updated 4 months ago
Alternatives and similar repositories for Awesome-LLM-RLVR
Users who are interested in Awesome-LLM-RLVR are comparing it to the repositories listed below.
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large … ☆96 · Updated 9 months ago
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent." ☆88 · Updated 11 months ago
- exploring whether LLMs perform case-based or rule-based reasoning ☆29 · Updated last year
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆54 · Updated 4 months ago
- ☆129 · Updated 7 months ago
- MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion (ACL 2025) ☆30 · Updated 3 months ago
- MMSci: A Multimodal Multi-Discipline Dataset for PhD-Level Scientific Comprehension ☆46 · Updated 10 months ago
- VeriGUI: Verifiable Long-Chain GUI Dataset ☆81 · Updated last week
- Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows" ☆111 · Updated last month
- JudgeLRM: Large Reasoning Models as a Judge ☆39 · Updated last month
- PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes [EMNLP 2024] ☆28 · Updated 11 months ago
- A Sober Look at Language Model Reasoning ☆84 · Updated last week
- The official repository for the Scientific Paper Idea Proposer (SciPIP) ☆64 · Updated 7 months ago
- Code for ACL 2025 Main paper "Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning… ☆40 · Updated 2 months ago
- ☆17 · Updated 11 months ago
- Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers" [NeurIPS D&B, 2024] ☆64 · Updated 9 months ago
- [NeurIPS25 Spotlight] EMPO, A Fully Unsupervised RLVR Method ☆68 · Updated this week
- A curated list of papers on LLMs and agents for scientific research and development ☆75 · Updated 10 months ago
- [NeurIPS'25] Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning ☆48 · Updated last month
- m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models ☆42 · Updated 6 months ago
- Structured Chemistry Reasoning with Large Language Models ☆38 · Updated last year
- ☆25 · Updated 6 months ago
- A curated list of resources on Reinforcement Learning with Verifiable Rewards (RLVR) and the reasoning capability boundary of Large Langu… ☆63 · Updated last week
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆177 · Updated 2 months ago
- A collection of resources and papers on AI Scientist / Robot Scientist ☆101 · Updated 2 weeks ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆26 · Updated last week
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction ☆82 · Updated 6 months ago
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision ☆16 · Updated 6 months ago
- [EMNLP 2025] CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward ☆49 · Updated 2 months ago
- [NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning" ☆112 · Updated last month