guyuntian / CoT_benchmark
Code for "Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective"
☆18 · Updated last year
Alternatives and similar repositories for CoT_benchmark:
Users interested in CoT_benchmark are comparing it to the repositories listed below:
- Official code for "Decoding-Time Language Model Alignment with Multiple Objectives". ☆18 · Updated 3 months ago
- Lightweight Adapting for Black-Box Large Language Models ☆19 · Updated last year
- [ACL'24, Outstanding Paper] Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! ☆33 · Updated 6 months ago
- ☆34 · Updated last year
- ☆30 · Updated 4 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆49 · Updated 4 months ago
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024 ☆21 · Updated 7 months ago
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards ☆42 · Updated 6 months ago
- Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep ☆71 · Updated 7 months ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆66 · Updated 6 months ago
- ThinK: Thinner Key Cache by Query-Driven Pruning ☆15 · Updated last week
- GenRM-CoT: Data release for verification rationales ☆47 · Updated 4 months ago
- ☆25 · Updated 9 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆115 · Updated 5 months ago
- Official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re… ☆23 · Updated 5 months ago
- Code for "Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning" ☆15 · Updated 11 months ago
- [ICLR 2025] Code & data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆12 · Updated 8 months ago
- ☆37 · Updated last year
- ☆27 · Updated 3 months ago
- ☆49 · Updated last year
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization ☆19 · Updated 3 weeks ago
- Official repo for "Towards Uncertainty-Aware Language Agent" ☆24 · Updated 6 months ago
- ☆25 · Updated last year
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆31 · Updated 7 months ago
- Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning [ICML 2024] ☆17 · Updated 9 months ago
- ☆14 · Updated 11 months ago
- [NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆39 · Updated 3 months ago
- Source code for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023) ☆15 · Updated last month
- Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts ☆16 · Updated 11 months ago
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity ☆62 · Updated 3 months ago