GraphPKU / number_cookbook
Official repository for the paper Number Cookbook: Number Understanding of Language Models and How to Improve It.
☆19 Updated 9 months ago
Alternatives and similar repositories for number_cookbook
Users who are interested in number_cookbook are comparing it to the repositories listed below.
- ☆346 Updated 5 months ago
- Reinforcing General Reasoning without Verifiers ☆93 Updated 6 months ago
- ☆50 Updated 10 months ago
- ☆118 Updated 3 weeks ago
- official implementation of paper "Process Reward Model with Q-value Rankings" ☆65 Updated 11 months ago
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ☆96 Updated 10 months ago
- Official implementation of paper "Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models" ☆56 Updated 2 weeks ago
- Discriminative Constrained Optimization for Reinforcing Large Reasoning Models ☆49 Updated 2 months ago
- A repo for open research on building large reasoning models ☆126 Updated 2 weeks ago
- A Sober Look at Language Model Reasoning ☆92 Updated last month
- Code for "Reasoning to Learn from Latent Thoughts" ☆124 Updated 9 months ago
- Code for "Variational Reasoning for Language Models" ☆54 Updated 3 months ago
- Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas ☆97 Updated 3 months ago
- 📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning. ☆328 Updated 2 months ago
- ThetaEvolve: Test-time Learning on Open Problems, enabling RL training on AlphaEvolve/OpenEvolve and emphasizing scaling test-time comput… ☆93 Updated last week
- Demystifying Reinforcement Learning in Agentic Reasoning ☆146 Updated 2 months ago
- JudgeLRM: Large Reasoning Models as a Judge ☆40 Updated last month
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards" ☆55 Updated this week
- Repo of paper "Free Process Rewards without Process Labels" ☆168 Updated 9 months ago
- TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models ☆380 Updated 3 weeks ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆182 Updated 5 months ago
- We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench show… ☆48 Updated this week
- MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning ☆110 Updated last month
- ☆176 Updated last month
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆191 Updated 10 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆71 Updated 7 months ago
- repo for paper https://arxiv.org/abs/2504.13837 ☆310 Updated 3 weeks ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples ☆113 Updated 5 months ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large … ☆98 Updated last year
- One-shot Entropy Minimization ☆187 Updated 6 months ago