facebookresearch / iGSM
Code for creating the iGSM datasets used in the papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Process" (arXiv 2407.20311) and "Physics of Language Models Part 2.2, How to Learn From Mistakes on Grade-School Math Problems" (arXiv 2408.16293)
☆45Updated 4 months ago
Alternatives and similar repositories for iGSM
Users who are interested in iGSM are comparing it to the repositories listed below
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…☆58Updated last month
- This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re…☆29Updated 7 months ago
- Code for "Reasoning to Learn from Latent Thoughts"☆94Updated last month
- Directional Preference Alignment☆57Updated 7 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision☆120Updated 8 months ago
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy☆60Updated 4 months ago
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models☆52Updated 2 months ago
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"☆63Updated last month
- Code accompanying the paper "Noise Contrastive Alignment of Language Models with Explicit Rewards" (NeurIPS 2024)☆51Updated 6 months ago
- [NeurIPS'24 Spotlight] Observational Scaling Laws☆54Updated 7 months ago
- ☆50Updated 3 months ago
- ☆36Updated last month
- Revisiting Mid-training in the Era of RL Scaling☆37Updated 2 weeks ago
- Exploration of automated dataset selection approaches at large scales.☆40Updated 2 months ago
- [NeurIPS 2024 Spotlight] Code and data for the paper "Finding Transformer Circuits with Edge Pruning".☆54Updated 2 months ago
- ☆69Updated this week
- ☆85Updated last year
- We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.☆62Updated 6 months ago
- ☆66Updated 5 months ago
- LightThinker: Thinking Step-by-Step Compression☆44Updated last month
- A curated list of awesome resources dedicated to Scaling Laws for LLMs☆71Updated 2 years ago
- ☆56Updated last week
- Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]☆134Updated 7 months ago
- [ICLR 2025] Code for the paper "Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning"☆54Updated 2 months ago
- official implementation of paper "Process Reward Model with Q-value Rankings"☆57Updated 3 months ago
- ☆67Updated last year
- ☆138Updated 5 months ago
- A tiny reproduction of DeepSeek R1-Zero on two A100s.☆66Updated 3 months ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models☆58Updated 5 months ago
- Repo of paper "Free Process Rewards without Process Labels"☆146Updated last month