protagolabs / odyssey-math
☆75 Updated last month
Related projects
Alternatives and complementary repositories for odyssey-math
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆97 Updated 2 months ago
- Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆127 Updated 2 months ago
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location. ☆73 Updated 3 months ago
- A framework for few-shot evaluation of autoregressive language models. ☆23 Updated 11 months ago
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit… ☆84 Updated 4 months ago
- Can Language Models Solve Olympiad Programming? ☆101 Updated 3 months ago
- ☆101 Updated 5 months ago
- Evaluating Mathematical Reasoning Beyond Accuracy ☆37 Updated 7 months ago
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆62 Updated last year
- Code & data for ICLR 2024 spotlight paper: 🍯MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data ☆37 Updated 5 months ago
- ☆103 Updated 4 months ago
- ☆18 Updated 2 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆78 Updated last month
- ☆89 Updated 11 months ago
- Self-Alignment with Principle-Following Reward Models ☆147 Updated 8 months ago
- ☆71 Updated 3 months ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆84 Updated 7 months ago
- ☆81 Updated last year
- Code and data used in the paper: "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold" ☆26 Updated 5 months ago
- Simple and efficient pytorch-native transformer training and inference (batched) ☆61 Updated 7 months ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. ☆47 Updated 4 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆80 Updated last week
- ☆25 Updated last month
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆75 Updated last month
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't… ☆83 Updated 4 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆103 Updated 6 months ago
- Language models scale reliably with over-training and on downstream tasks ☆94 Updated 7 months ago
- Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment" ☆68 Updated 5 months ago
- ☆25 Updated 6 months ago
- Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models ☆41 Updated last year