wassname / quiet-star

Investigate the Quiet-STaR paper and its thought scratchpad

☆13

Alternatives and similar repositories for quiet-star

Users that are interested in quiet-star are comparing it to the libraries listed below

ChangyuChen347 / MaskedThought
[ACL 2024] Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models
☆21Updated last year
test-time-interaction / TTI
☆48Updated last month
LAMDASZ-ML / Self-Backtracking
☆47Updated 5 months ago
dinobby / MAGDi
The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…
☆35Updated last year
THUDM / T1
RL Scaling and Test-Time Scaling (ICML'25)
☆108Updated 5 months ago
fangyuan-ksgk / CoT-Reasoning-without-Prompting
Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting
☆32Updated last year
zitian-gao / SC-MCTS
Interpretable Contrastive Monte Carlo Tree Search Reasoning
☆49Updated 8 months ago
open-compass / GPassK
[ACL 2025] Are Your LLMs Capable of Stable Reasoning?
☆26Updated 3 months ago
WindyLee0822 / Process_Q_Model
Official implementation of the paper "Process Reward Model with Q-value Rankings"
☆60Updated 5 months ago
GAIR-NLP / OlympicArena
[NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
☆102Updated 4 months ago
kyleliang919 / Online-Subspace-Descent
[NeurIPS 2024] Low rank memory efficient optimizer without SVD
☆30Updated 2 weeks ago
mathllm / Step-Controlled_DPO
☆22Updated last year
mandyyyyii / east
☆20Updated 3 months ago
schauppi / Self-Rewarding-Language-Models
☆46Updated last year
menhguin / minp_paper
Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper
☆39Updated 4 months ago
RenzeLou / AAAR-1.0
The source code for running LLMs on the AAAR-1.0 benchmark.
☆16Updated 3 months ago
xufangzhi / phi-Decoding
[ACL 2025] An inference-time decoding strategy with adaptive foresight sampling
☆99Updated last month
mathllm / MathCoder2
☆63Updated 9 months ago
microsoft / tale-suite
Text Adventure Learning Environment Suite - Benchmark to evaluate language models on interactive text environments.
☆18Updated last month
YangLing0818 / SuperCorrect-llm
[ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
☆74Updated 3 months ago
yale-nlp / MCTS-RAG
☆57Updated 2 weeks ago
ernie-research / Tool-Augmented-Reward-Model
[ICLR'24 spotlight] Tool-Augmented Reward Modeling
☆50Updated last month
jdf-prog / LLM-Engines
☆50Updated last month
dvlab-research / ARPO
Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
☆88Updated last month
allenai / easy-to-hard-generalization
Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"
☆48Updated last year
zjunlp / unlearn
[ACL 2025] Knowledge Unlearning for Large Language Models
☆38Updated 2 months ago
SihengLi99 / SEALONG
Large Language Models Can Self-Improve in Long-context Reasoning
☆71Updated 7 months ago
hamishivi / automated-instruction-selection
Exploration of automated dataset selection approaches at large scales.
☆47Updated 4 months ago
Infini-AI-Lab / S2FT
☆18Updated 6 months ago
JieyuZ2 / TaskMeAnything
[NeurIPS 2024] A task generation and model evaluation system for multimodal language models.
☆71Updated 7 months ago