Zayne-sprague / To-CoT-or-not-to-CoTLinks

☆26

Alternatives and similar repositories for To-CoT-or-not-to-CoT

Users that are interested in To-CoT-or-not-to-CoT are comparing it to the libraries listed below

Sorting:

yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆62Updated 8 months ago
bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆81Updated last month
hkust-nlp / Activation_Decoding
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)
☆61Updated last year
rookie-joe / AutoPSV
☆48Updated 9 months ago
liziniu / GEM
Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large Language Models)
☆35Updated 2 months ago
sail-sg / CPO
[NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.
☆125Updated 4 months ago
THU-KEG / RM-Bench
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
☆58Updated 2 weeks ago
RUCAIBox / RLMEC
The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"
☆38Updated last year
zhiyuanhubj / UoT
[NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models
☆100Updated last year
JayZhang42 / SLED
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433
☆28Updated 8 months ago
alisawuffles / proxy-tuning
Code associated with Tuning Language Models by Proxy (Liu et al., 2024)
☆114Updated last year
NineAbyss / S2R
This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"
☆69Updated 3 months ago
Raibows / CREAM
Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025.
☆24Updated 5 months ago
BunsenFeng / AbstainQA
AbstainQA, ACL 2024
☆27Updated 9 months ago
SihengLi99 / LLM-Honesty-Survey
[2025-TMLR] A Survey on the Honesty of Large Language Models
☆58Updated 7 months ago
mathllm / Step-Controlled_DPO
☆22Updated last year
ShujinWu-0814 / ALOE
Public code repo for COLING 2025 paper "Aligning LLMs with Individual Preferences via Interaction"
☆33Updated 4 months ago
junkangwu / beta-DPO
[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
☆45Updated 9 months ago
WeiminXiong / IPR
Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)
☆61Updated 9 months ago
uservan / ThinkPO
☆18Updated this week
MingyuJ666 / The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models
[ACL'24] Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models (LLMs). However, the correla…
☆46Updated 2 months ago
LiuAmber / RAHF
[ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.…
☆25Updated 10 months ago
KbsdJames / omni-math-rule
The rule-based evaluation subset and code implementation of Omni-MATH
☆22Updated 7 months ago
TianHongZXY / RLVR-Decomposed
Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"
☆83Updated 3 weeks ago
YJiangcm / LTE
[ACL 2024] Learning to Edit: Aligning LLMs with Knowledge Editing
☆36Updated 11 months ago
lemon-prog123 / LongRePS
Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision
☆16Updated 4 months ago
Scarelette / CulturePark
☆24Updated 9 months ago
icip-cas / Verifier-Engineering
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
☆61Updated 8 months ago
fangyuan-ksgk / CoT-Reasoning-without-Prompting
Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting
☆32Updated last year
LightChen233 / reasoning-boundary
☆67Updated last month