WANGXinyiLinda / LM_random_walkLinks

Official code for paper Understanding the Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation

☆20

Alternatives and similar repositories for LM_random_walk

Users that are interested in LM_random_walk are comparing it to the libraries listed below

Sorting:

Shentao-YANG / Preference_Grounded_Guidance
Source codes for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023).
☆17Updated 10 months ago
dannyallover / overthinking_the_truth
☆29Updated last year
Mihir3009 / LogicBench
LogicBench is a natural language question-answering dataset consisting of 25 different reasoning patterns spanning over propositional, fi…
☆32Updated last year
yuzhaouoe / SAE-based-representation-engineering
[NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
☆67Updated last year
ruiqi-zhong / nlparam
Augmenting Statistical Models with Natural Language Parameters
☆29Updated last year
deeplearning-wisc / args
☆46Updated last year
ZhaofengWu / counterfactual-evaluation
☆57Updated 6 months ago
Edward-Sun / easy-to-hard
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
☆124Updated last year
FreedomIntelligence / OVM
☆69Updated last year
sail-sg / lm-random-memory-access
☆15Updated last year
princeton-nlp / MQuAKE
[EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
☆119Updated last year
roeehendel / icl_task_vectors
☆101Updated 2 years ago
genrm-star / genrm-critiques
GenRM-CoT: Data release for verification rationales
☆67Updated last year
xiye17 / TextualExplInContext
The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning (NeurIPS 2022)
☆16Updated 2 years ago
cicl-stanford / procedural-evals-tom
☆35Updated 2 years ago
BunsenFeng / AbstainQA
AbstainQA, ACL 2024
☆28Updated last year
YuxiXie / SelfEval-Guided-Decoding
☆103Updated last year
ajyl / dpo_toxic
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity.
☆84Updated 8 months ago
WeiminXiong / IPR
Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)
☆63Updated last year
Zhou-Zoey / RMB-Reward-Model-Benchmark
☆46Updated 8 months ago
joeljang / RLPHF
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
☆110Updated 2 years ago
linlu-qiu / lm-inductive-reasoning
☆33Updated 2 years ago
ADaM-BJTU / W2SG
The code of “Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning”
☆17Updated last year
iQua / llmpebase
This is a unified platform for implementing and evaluating test-time reasoning mechanisms in Large Language Models (LLMs).
☆19Updated 10 months ago
WANGXinyiLinda / concept-based-demonstration-selection
Offical code of the paper Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Le…
☆75Updated last year
tatsu-lab / linguistic_calibration
Align your LM to express calibrated verbal statements of confidence in its long-form generations.
☆27Updated last year
kaistAI / knowledge-reasoning
[EMNLP 2024] Official implementation of "Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Ut…
☆22Updated 11 months ago
chang-github-00 / LLM-Predictive-Decoding
☆14Updated 4 months ago
balevinstein / Probes
☆57Updated 2 years ago
liziniu / GEM
Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large Language Models)
☆48Updated 6 months ago