spinbench / spinbench
☆28Updated 5 months ago
Alternatives and similar repositories for spinbench
Users interested in spinbench are comparing it to the libraries listed below
- [ICLR 2025 Workshop] "Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models"☆44Updated 5 months ago
- Code for "Reasoning to Learn from Latent Thoughts"☆124Updated 10 months ago
- Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".☆116Updated 6 months ago
- ☆117Updated last year
- ☆224Updated 10 months ago
- A repo for open research on building large reasoning models☆136Updated last week
- A Sober Look at Language Model Reasoning☆92Updated 2 months ago
- official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and…☆70Updated 10 months ago
- [NeurIPS 2025] RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning☆50Updated 3 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.☆134Updated 10 months ago
- Code repository for "RL Grokking Recipe: How RL Unlocks and Transfers New Algorithms in LLMs"☆29Updated 3 months ago
- AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning☆54Updated 7 months ago
- GenRM-CoT: Data release for verification rationales☆67Updated last year
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"☆124Updated last year
- ☆229Updated last month
- FeatureAlignment = Alignment + Mechanistic Interpretability☆34Updated 11 months ago
- ☆20Updated last year
- ☆73Updated 9 months ago
- ☆32Updated last year
- ☆352Updated 6 months ago
- Reinforcing General Reasoning without Verifiers☆96Updated 7 months ago
- Repo of paper "Free Process Rewards without Process Labels"☆168Updated 10 months ago
- Rewarded soups official implementation☆62Updated 2 years ago
- Benchmark and research code for the paper SWEET-RL Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆260Updated 9 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples☆120Updated last week
- [NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"☆160Updated 3 months ago
- [COLM 2025] SEAL: Steerable Reasoning Calibration of Large Language Models for Free☆51Updated 10 months ago
- Repo for Anonymous purpose, pls don't distribute☆10Updated last year
- A brief and partial summary of RLHF algorithms.☆144Updated 11 months ago
- Reasoning Activation in LLMs via Small Model Transfer (NeurIPS 2025)☆21Updated 3 months ago