interp-reasoning / thought-anchors
⚓️ Repository for the "Thought Anchors: Which LLM Reasoning Steps Matter?" paper.
☆48 Updated 2 weeks ago
Alternatives and similar repositories for thought-anchors
Users who are interested in thought-anchors are comparing it to the libraries listed below.
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples ☆101 Updated last month
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆102 Updated 3 weeks ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆39 Updated 4 months ago
- ☆26 Updated last year
- Code and Data for "MIRAI: Evaluating LLM Agents for Event Forecasting" ☆66 Updated last year
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆77 Updated 7 months ago
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆35 Updated last year
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆112 Updated last year
- ☆66 Updated last year
- ☆50 Updated last month
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆58 Updated 5 months ago
- [NeurIPS 2024] Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study ☆52 Updated 7 months ago
- ☆50 Updated 4 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆96 Updated last month
- Tree prompting: easy-to-use scikit-learn interface for improved prompting. ☆38 Updated last year
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆32 Updated last year
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆115 Updated 9 months ago
- [ACL 2025] Agentic Knowledgeable Self-awareness ☆75 Updated last month
- ☆55 Updated 3 weeks ago
- [NeurIPS 2024] Knowledge Circuits in Pretrained Transformers ☆149 Updated 5 months ago
- Process Reward Models That Think ☆46 Updated 2 weeks ago
- ☆27 Updated last year
- Reinforcing General Reasoning without Verifiers ☆71 Updated 3 weeks ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆149 Updated 5 months ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆91 Updated 2 months ago
- ☆33 Updated 2 months ago
- official implementation of paper "Process Reward Model with Q-value Rankings" ☆60 Updated 5 months ago
- ☆117 Updated 4 months ago
- ☆38 Updated 5 months ago
- Governance of the Commons Simulation (GovSim) ☆55 Updated 6 months ago