xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
☆144 · Nov 13, 2025 · Updated 3 months ago
Alternatives and similar repositories for xVerify
Users that are interested in xVerify are comparing it to the libraries listed below
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆121 · Dec 10, 2024 · Updated last year
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25] ☆223 · Nov 27, 2025 · Updated 3 months ago
- ☆21 · Jul 9, 2025 · Updated 7 months ago
- ☆19 · Mar 10, 2025 · Updated 11 months ago
- PGRAG ☆52 · Jul 16, 2024 · Updated last year
- A light-weight tool for evaluating LLMs in rule-based ways. ☆85 · Jun 19, 2025 · Updated 8 months ago
- [EMNLP 2024 Main] Official implementation of the paper "The Accuracy Paradox in RLHF: When Better Reward Models Don't Yield Better Langua… ☆13 · Nov 11, 2024 · Updated last year
- A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect… ☆134 · Jan 31, 2026 · Updated last month
- ☆331 · May 31, 2025 · Updated 9 months ago
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models ☆72 · Feb 25, 2025 · Updated last year
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆50 · Feb 4, 2026 · Updated last month
- [COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill? ☆37 · Jun 5, 2025 · Updated 9 months ago
- ☆1,104 · Jan 10, 2026 · Updated last month
- Feeling confused about super alignment? Here is a reading list ☆44 · Jan 9, 2024 · Updated 2 years ago
- PyTorch implementation of experiments in the paper Aligning Language Models with Human Preferences via a Bayesian Approach ☆32 · Nov 6, 2023 · Updated 2 years ago
- [ACL 2025] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLM… ☆68 · Oct 27, 2024 · Updated last year
- Codes for Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback (ACL 2024 Findings) ☆16 · Jul 2, 2024 · Updated last year
- Code for the paper "Self-Detoxifying Language Models via Toxification Reversal" (EMNLP 2023) ☆18 · Oct 17, 2023 · Updated 2 years ago
- A Sober Look at Language Model Reasoning ☆93 · Nov 18, 2025 · Updated 3 months ago
- ☆27 · Aug 27, 2025 · Updated 6 months ago
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning". ☆20 · Feb 26, 2025 · Updated last year
- This repository is maintained to release dataset and models for multimodal puzzle reasoning. ☆113 · Feb 26, 2025 · Updated last year
- ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning & ReCall: Learning to Reason with Tool Call for LLMs via Rei… ☆1,328 · May 16, 2025 · Updated 9 months ago
- Code for paper called Self-Training Elicits Concise Reasoning in Large Language Models ☆42 · Apr 22, 2025 · Updated 10 months ago
- Understanding R1-Zero-Like Training: A Critical Perspective ☆1,219 · Aug 27, 2025 · Updated 6 months ago
- Scalable RL solution for advanced reasoning of language models ☆1,811 · Mar 18, 2025 · Updated 11 months ago
- ☆22 · Jun 10, 2025 · Updated 8 months ago
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining" ☆27 · Oct 14, 2025 · Updated 4 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆273 · Apr 26, 2024 · Updated last year
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks ☆261 · May 5, 2025 · Updated 10 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆71 · Jun 1, 2025 · Updated 9 months ago
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ☆97 · Feb 21, 2025 · Updated last year
- The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models" ☆29 · Feb 23, 2026 · Updated last week
- A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning ☆284 · Sep 25, 2025 · Updated 5 months ago
- ☆325 · Jul 25, 2024 · Updated last year
- ☆23 · Jul 5, 2024 · Updated last year
- Repo of paper "Free Process Rewards without Process Labels" ☆169 · Mar 14, 2025 · Updated 11 months ago
- ☆74 · Jun 28, 2025 · Updated 8 months ago
- ☆47 · Apr 9, 2025 · Updated 10 months ago