tongyx361 / symeval
Evaluation utilities based on SymPy.
☆20 · Updated 11 months ago
Alternatives and similar repositories for symeval
Users who are interested in symeval are comparing it to the repositories listed below.
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning" ☆177 · Updated 6 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆116 · Updated 11 months ago
- LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation ☆32 · Updated last month
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆237 · Updated 2 months ago
- ☆67 · Updated 7 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆267 · Updated last year
- ☆48 · Updated 3 months ago
- The rule-based evaluation subset and code implementation of Omni-MATH ☆25 · Updated 11 months ago
- Async pipelined version of Verl ☆125 · Updated 7 months ago
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning". ☆115 · Updated 3 months ago
- The official repository of the Omni-MATH benchmark. ☆88 · Updated 11 months ago
- GenRM-CoT: Data release for verification rationales ☆67 · Updated last year
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆52 · Updated last year
- ☆52 · Updated 6 months ago
- ☆216 · Updated 8 months ago
- Model merging is a highly efficient approach for long-to-short reasoning. ☆91 · Updated last month
- Resources for the Enigmata Project. ☆73 · Updated 3 months ago
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy ☆76 · Updated last month
- ☆76 · Updated last year
- [NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning" ☆125 · Updated last month
- Repo of paper "Free Process Rewards without Process Labels" ☆166 · Updated 8 months ago
- LeetCode Training and Evaluation Dataset ☆40 · Updated 7 months ago
- Collection of papers for scalable automated alignment. ☆94 · Updated last year
- ☆69 · Updated last year
- ☆29 · Updated last month
- ☆52 · Updated 8 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆116 · Updated 6 months ago
- ☆20 · Updated 4 months ago
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆145 · Updated last year
- Reproducing R1 for Code with Reliable Rewards ☆272 · Updated 6 months ago