StevenZHB / CoT_Causal_Analysis

Repository for the paper "How Likely Do LLMs with CoT Mimic Human Reasoning?"

☆23

Alternatives and similar repositories for CoT_Causal_Analysis

Users who are interested in CoT_Causal_Analysis are comparing it to the repositories listed below.

fangyuan-ksgk / CoT-Reasoning-without-Prompting
Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting
☆33 Updated last year
MingyuJ666 / The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models
[ACL'24] Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models (LLMs). However, the correla…
☆46 Updated 5 months ago
OSU-NLP-Group / llm-planning-eval
[ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"
☆54 Updated last year
Yu-Fangxu / FoR
[ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples
☆109 Updated 3 months ago
causalNLP / corr2cause
Data and code for the Corr2Cause paper (ICLR 2024)
☆111 Updated last year
activatedgeek / calibration-tuning
☆52 Updated 7 months ago
Reason-Wang / NAT
[NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…
☆29 Updated last year
bowen-upenn / llm_token_bias
[EMNLP 2024] A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners
☆25 Updated 10 months ago
icip-cas / Verifier-Engineering
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
☆62 Updated 11 months ago
zchuz / TimeBench
The repository for ACL 2024 paper "TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models"
☆32 Updated last year
Jiuzhouh / Uncertainty-Aware-Language-Agent
This is the official repo for Towards Uncertainty-Aware Language Agent.
☆29 Updated last year
tatsu-lab / test_set_contamination
☆41 Updated 2 years ago
GAIR-NLP / MetaCritique
Evaluate the Quality of Critique
☆36 Updated last year
psunlpgroup / ReaLMistake
This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses".
☆30 Updated last year
sail-sg / CPO
[NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.
☆131 Updated 7 months ago
stanfordnlp / axbench
Stanford NLP Python library for benchmarking the utility of LLM interpretability methods
☆138 Updated 4 months ago
GraphPKU / Case_or_Rule
exploring whether LLMs perform case-based or rule-based reasoning
☆30 Updated last year
yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆66 Updated 11 months ago
Zayne-sprague / To-CoT-or-not-to-CoT
☆25 Updated 7 months ago
casmlab / NPHardEval
Repository for NPHardEval, a quantified-dynamic benchmark of LLMs
☆59 Updated last year
da03 / implicit_chain_of_thought
☆139 Updated 11 months ago
SalesforceAIResearch / FoFo
☆27 Updated 9 months ago
qcznlp / uncertainty_attack
☆21 Updated 2 months ago
WeiminXiong / IPR
Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)
☆63 Updated last year
QingruZhang / PASTA
PASTA: Post-hoc Attention Steering for LLMs
☆127 Updated 11 months ago
joeljang / RLPHF
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
☆110 Updated 2 years ago
LAMDASZ-ML / Self-Backtracking
☆50 Updated 8 months ago
tianyang-x / SaySelf
Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales"
☆109 Updated last year
YuxiXie / SelfEval-Guided-Decoding
☆103 Updated last year
Edward-Sun / easy-to-hard
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
☆125 Updated last year