bin123apple / MACM
MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems
☆70 Updated 3 months ago
Related projects
Alternatives and complementary repositories for MACM
- "Improving Mathematical Reasoning with Process Supervision" by OPENAI ☆83 Updated last week
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆78 Updated last month
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location. ☆73 Updated 3 months ago
- The official repo for "TheoremQA: A Theorem-driven Question Answering dataset" (EMNLP 2023) ☆21 Updated 6 months ago
- ☆50 Updated last month
- ☆72 Updated 5 months ago
- [ACL 2024 Findings] MathBench: A Comprehensive Multi-Level Difficulty Mathematics Evaluation Dataset ☆84 Updated 4 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆103 Updated 6 months ago
- Official implementation of DPFM @ ICLR 2024 paper "Autonomous Data Selection with Language Models for Mathematical Texts" (As Huggingface… ☆79 Updated 2 weeks ago
- ☆101 Updated 5 months ago
- ☆116 Updated 5 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning ☆24 Updated last week
- Mix of Minimal Optimal Sets (MMOS) of dataset has two advantages for two aspects, higher performance and lower construction costs on math… ☆69 Updated 3 months ago
- A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench. ☆114 Updated 2 months ago
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't… ☆83 Updated 4 months ago
- Implementation of the Quiet-STAR paper (https://arxiv.org/pdf/2403.09629.pdf) ☆42 Updated 3 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆97 Updated 2 months ago
- ☆34 Updated 3 months ago
- A curated list of papers related to constrained decoding of LLM, along with their relevant code and resources. ☆94 Updated 3 weeks ago
- ☆75 Updated last month
- Code implementation of synthetic continued pretraining ☆60 Updated last month
- The official repository of the Omni-MATH benchmark. ☆49 Updated 2 weeks ago
- Benchmarking LLMs with Challenging Tasks from Real Users ☆195 Updated 2 weeks ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆99 Updated 3 weeks ago
- ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models ☆167 Updated last month
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024) ☆199 Updated 6 months ago
- Open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality ☆160 Updated 3 months ago
- Reformatted Alignment ☆112 Updated last month
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆124 Updated 3 weeks ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆63 Updated last month