ryoungj / BoLT
Code for "Reasoning to Learn from Latent Thoughts"
☆89Updated 3 weeks ago
Alternatives and similar repositories for BoLT:
Users who are interested in BoLT are comparing it to the repositories listed below:
- Repo of paper "Free Process Rewards without Process Labels"☆141Updated last month
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision☆120Updated 7 months ago
- Code for "A Sober Look at Progress in Language Model Reasoning" paper☆33Updated this week
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models☆47Updated last month
- Official implementation of the paper "Process Reward Model with Q-value Rankings"☆54Updated 2 months ago
- [NeurIPS'24 Spotlight] Observational Scaling Laws☆54Updated 6 months ago
- This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re…☆29Updated 6 months ago
- This is an official implementation of the paper "Building Math Agents with Multi-Turn Iterative Preference Learning" with multi-turn DP…☆25Updated 4 months ago
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".☆92Updated last month
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*☆101Updated 4 months ago
- A brief and partial summary of RLHF algorithms.☆127Updated last month
- GenRM-CoT: Data release for verification rationales☆56Updated 6 months ago
- Repo for "Z1: Efficient Test-time Scaling with Code"☆53Updated last week
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"☆59Updated last month
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning☆190Updated last month
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"☆51Updated 3 weeks ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"☆175Updated last month
- [NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study☆49Updated 4 months ago
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning☆66Updated 2 months ago
- Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]☆131Updated 7 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning☆48Updated 5 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.☆114Updated 3 weeks ago