Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".
☆55Nov 29, 2024Updated last year
Alternatives and similar repositories for MathCritique
Users who are interested in MathCritique are comparing it to the repositories listed below
- ☆23Jul 5, 2024Updated last year
- ☆25Aug 23, 2024Updated last year
- ☆14Dec 25, 2024Updated last year
- This repository provides the code for applying Contrastive Learning Penalty Loss (CLPL) and Mixture of Experts (MoE) to the BGE-M3 text e…☆11Dec 27, 2024Updated last year
- ☆24Oct 31, 2025Updated 4 months ago
- The corresponding code from our paper "Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning…☆13Jan 14, 2026Updated last month
- ☆56Mar 6, 2025Updated 11 months ago
- ScreenExplorer: Training a Vision-Language Model for Diverse Exploration in Open GUI World☆24Jun 17, 2025Updated 8 months ago
- ☆14May 4, 2024Updated last year
- Official Repository of "AtomThink: Multimodal Slow Thinking with Atomic Step Reasoning"☆62Nov 18, 2025Updated 3 months ago
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)☆692Jan 20, 2025Updated last year
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"☆184May 20, 2025Updated 9 months ago
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer.☆14Mar 2, 2024Updated last year
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization"☆14Jun 21, 2024Updated last year
- ☆14Jan 10, 2024Updated 2 years ago
- A series of technical reports on Slow Thinking with LLM☆760Aug 13, 2025Updated 6 months ago
- Large Reasoning Models☆807Dec 3, 2024Updated last year
- GenRM-CoT: Data release for verification rationales☆68Oct 16, 2024Updated last year
- ☆18Jan 3, 2025Updated last year
- [ICLR 2026] RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling☆32Feb 1, 2026Updated 3 weeks ago
- Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"☆391Jan 19, 2025Updated last year
- Vehicle detection based on YOLO and SVM☆15Jan 29, 2018Updated 8 years ago
- [ACL 2025] "World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning." https://arxiv.org/abs/2503.1…☆17Jul 22, 2025Updated 7 months ago
- UnifiedToolHub is a comprehensive project supporting LLM-based tool use, designed to unify various tool-use dataset formats and provide t…☆19Jul 23, 2025Updated 7 months ago
- Code for the paper: Dense Reward for Free in Reinforcement Learning from Human Feedback (ICML 2024) by Alex J. Chan, Hao Sun, Samuel Holt…☆38Aug 11, 2024Updated last year
- ☆72Jun 10, 2025Updated 8 months ago
- [ICLR 2025 Oral] Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition☆18Nov 25, 2024Updated last year
- ☆20Oct 10, 2025Updated 4 months ago
- ☆17May 19, 2023Updated 2 years ago
- SCoRe: Training Language Models to Self-Correct via Reinforcement Learning☆16Jan 24, 2025Updated last year
- ☆27Sep 11, 2024Updated last year
- Repository for the EMNLP 2023 Demo Paper "Reaction Miner: An Integrated System for Chemical Reaction Extraction from Textual Data"☆19Jan 27, 2025Updated last year
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"☆75May 20, 2025Updated 9 months ago
- ☆321Sep 18, 2024Updated last year
- [NeurIPS 2024] Can Language Models Learn to Skip Steps?☆22Jan 25, 2025Updated last year
- ☆25Dec 12, 2025Updated 2 months ago
- Data and Code for the paper "FinanceMath: Knowledge-Intensive Math Reasoning in Finance Domains"☆24Aug 10, 2024Updated last year
- ☆21Jul 25, 2025Updated 7 months ago
- Self-Supervised Alignment with Mutual Information☆20May 24, 2024Updated last year