Walter0807 / RepBelief

[ICML 2024] Language Models Represent Beliefs of Self and Others

☆33

Alternatives and similar repositories for RepBelief

Users who are interested in RepBelief are comparing it to the repositories listed below.

szxiangjn / world-model-for-language-model
☆132 Updated last year
rookie-joe / AutoPSV
☆50 Updated last year
cicl-stanford / procedural-evals-tom
☆35 Updated 2 years ago
sotopia-lab / sotopia-pi
Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024)
☆78 Updated last year
RLHFlow / Directional-Preference-Alignment
Directional Preference Alignment
☆57 Updated last year
ValueCompass / Alignment-Goal-Survey
☆30 Updated last year
ying-hui-he / Hi-ToM_dataset
☆15 Updated last month
Yifan-Song793 / ETO
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)
☆153 Updated last year
cathyxl / MAgIC
☆41 Updated last year
jianggy / MPI
This repo contains code for our NeurIPS 2023 spotlight paper: Evaluating and Inducing Personality in Pre-trained Language Models
☆55 Updated last year
SihengLi99 / LLM-Honesty-Survey
[2025-TMLR] A Survey on the Honesty of Large Language Models
☆62 Updated 11 months ago
facebookresearch / rlfh-gen-div
This is code for most of the experiments in the paper Understanding the Effects of RLHF on LLM Generalisation and Diversity
☆47 Updated last year
sail-sg / dice
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
☆44 Updated 7 months ago
elated-sawyer / WALL-E
Official code for the paper: WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
☆52 Updated 6 months ago
PKU-Alignment / aligner
[NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct
☆190 Updated 10 months ago
Victorwz / LaViA
☆10 Updated last year
XiaojuanTang / ICSR
Implementation of the paper "Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners"
☆20 Updated 2 years ago
Timothyxxx / WorldModelPapers
A collection of papers from the continuing line of work that started with World Models.
☆188 Updated last year
idanshen / Value-Augmented-Sampling
☆20 Updated last year
Mars-tin / awesome-theory-of-mind
Machine Theory of Mind Reading List. Built upon EMNLP Findings 2023 Paper: Towards A Holistic Landscape of Situated Theory of Mind in Lar…
☆146 Updated 9 months ago
joeljang / RLPHF
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
☆111 Updated 2 years ago
bigai-nlco / langsuite
Official Repo of LangSuitE
☆84 Updated last year
CUHK-ARISE / GAMABench
Benchmarking LLMs' Gaming Ability in Multi-Agent Environments
☆88 Updated 6 months ago
holarissun / Prompt-OIRL
Code for the paper Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning
☆42 Updated last year
alexrame / rewardedsoups
Rewarded soups official implementation
☆62 Updated 2 years ago
sanowl / Self-Correcting-LLM--Reinforcement-Learning-
This is my attempt to create a Self-Correcting-LLM based on the paper Training Language Models to Self-Correct via Reinforcement Learning by g…
☆37 Updated 4 months ago
THU-KEG / RM-Bench
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
☆67 Updated 4 months ago
ZHZisZZ / modpo
[ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
☆92 Updated last year
Linear95 / APO
Code for ACL2024 paper - Adversarial Preference Optimization (APO).
☆56 Updated last year
Linear95 / DSP
Domain-specific preference (DSP) data and customized RM fine-tuning.
☆25 Updated last year