WEIRDLabUW / vpl_llm

Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF"

☆20

Alternatives and similar repositories for vpl_llm

Users who are interested in vpl_llm are comparing it to the repositories listed below.

facebookresearch / ssorl
Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories
☆42 Updated last year
abaheti95 / LoL-RL
Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients
☆26 Updated 9 months ago
chongyi-zheng / td_infonce
Implementations of Temporal Difference InfoNCE (TD InfoNCE)
☆29 Updated last year
ZhaolinGao / REBEL
Reinforcement Learning via Regressing Relative Rewards
☆34 Updated 6 months ago
cassidylaidlaw / hidden-context
Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF"
☆29 Updated last year
younggyoseo / trajectory_mcl
Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning (NeurIPS 2020)
☆39 Updated 4 years ago
prajjwal1 / rl_paradigm
☆17 Updated last year
amazon-science / replay-based-recurrent-rl
Code for "Task-Agnostic Continual RL: In Praise of a Simple Baseline"
☆34 Updated 2 years ago
snu-mllab / DPPO
Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023)
☆42 Updated 11 months ago
Kaixhin / GUDRL
Generalised UDRL
☆37 Updated 3 years ago
csmile-1006 / ARP
Guide Your Agent with Adaptive Multimodal Rewards (NeurIPS 2023 Accepted)
☆33 Updated last year
ml-jku / L2M
Learning to Modulate pre-trained Models in RL (Decision Transformer, LoRA, Fine-tuning)
☆58 Updated 8 months ago
sfujim / SR-DICE
Author's PyTorch implementation of SR-DICE for marginalized importance sampling
☆17 Updated 3 years ago
htdt / lwm
Latent World Models For Intrinsically Motivated Exploration | Official repository
☆22 Updated 4 years ago
micahcarroll / uniMASK
Codebase for "Uni[MASK]: Unified Inference in Sequential Decision Problems"
☆55 Updated 11 months ago
adaptive-intelligent-robotics / QDAC
Repository for "Quality-Diversity Actor-Critic: Learning High-Performing and Diverse Behaviors via Value and Successor Features Critics" …
☆16 Updated last year
dido1998 / CausalMBRL
Official data and code for our paper Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning
☆48 Updated 3 years ago
liziniu / policy_optimization
Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)
☆28 Updated last year
LunjunZhang / world-model-as-a-graph
Code for "World Model as a Graph: Learning Latent Landmarks for Planning" (ICML 2021 Long Presentation)
☆66 Updated 3 years ago
mila-iqia / SGI
Official code for "Pretraining Representations For Data-Efficient Reinforcement Learning" (NeurIPS 2021)
☆54 Updated 3 years ago
yiqiwang8177 / Official-codebase-for-Decision-Transducer
This is the PyTorch implementation of the UAI 2023 paper "A Trajectory is Worth Three Sentences: Multimodal Transformer for Offline Reinf…
☆11 Updated last year
vivekmyers / contrastive_planning
Code for the paper "Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference"
☆42 Updated 11 months ago
princeton-nlp / lwm
We develop world models that can be adapted with natural language. Integrating these models into artificial agents allows humans to effe…
☆24 Updated last year
eric-mitchell / macaw-min
Clean, extensible implementation of MACAW [ICML 2021]
☆12 Updated 3 years ago
Cranial-XIX / metric-residual-network
Official PyTorch Implementation for Metric Residual Networks for Sample Efficient Goal-Conditioned Reinforcement Learning
☆17 Updated 2 years ago
phlippe / CITRIS
Code repository of the paper "CITRIS: Causal Identifiability from Temporal Intervened Sequences" and "iCITRIS: Causal Representation Lear…
☆53 Updated 2 years ago
facebookresearch / gen_dgrl
Official codebase for "The Generalization Gap in Offline Reinforcement Learning" accepted to ICLR 2024
☆28 Updated 10 months ago
mila-iqia / Skipper
A PyTorch Implementation of Skipper
☆28 Updated 9 months ago
yudasong / briee
Representation Learning in RL
☆15 Updated 3 years ago
taodav / nsrs
Code for the paper Novelty Search in Representational Space for Sample Efficient Exploration presented at NeurIPS 2020.
☆14 Updated 11 months ago