XanderJC / attention-based-credit
Code for the paper: Dense Reward for Free in Reinforcement Learning from Human Feedback (ICML 2024) by Alex J. Chan, Hao Sun, Samuel Holt, and Mihaela van der Schaar
☆17 Updated last month
Related projects:
- Rewarded soups official implementation ☆43 Updated 11 months ago
- Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023) ☆37 Updated 2 months ago
- Learning to Modulate pre-trained Models in RL (Decision Transformer, LoRA, Fine-tuning) ☆49 Updated 8 months ago
- Codebase for "Uni[MASK]: Unified Inference in Sequential Decision Problems" ☆54 Updated 2 months ago
- Guide Your Agent with Adaptive Multimodal Rewards (NeurIPS 2023 Accepted) ☆32 Updated 11 months ago
- Implementation of ICML 2023 paper: Future-conditioned Unsupervised Pretraining for Decision Transformer ☆25 Updated last year
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment" ☆38 Updated last month
- Official codebase for "The Generalization Gap in Offline Reinforcement Learning" accepted to ICLR 2024 ☆26 Updated last month
- Implements the Messenger environment and EMMA model. ☆22 Updated last year
- Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data) ☆23 Updated 9 months ago
- Official PyTorch implementation of "Discovering Hierarchical Achievements in Reinforcement Learning via Contrastive Learning" (NeurIPS 20… ☆27 Updated last month
- Clean, extensible implementation of MACAW [ICML 2021] ☆10 Updated 2 years ago
- Official code repository for Prompt-DT. ☆93 Updated 2 years ago
- Direct preference optimization with f-divergences. ☆11 Updated last week
- ICLR 2021: "Monte-Carlo Planning and Learning with Language Action Value Estimates" ☆30 Updated 9 months ago
- Official code for "Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning". ☆37 Updated 5 months ago
- ☆27 Updated last year
- ☆11 Updated last year
- Implementation of the Decision-Pretrained Transformer (DPT) from the paper Supervised Pretraining Can Learn In-Context Reinforcement Learni… ☆40 Updated 3 months ago
- Code for "Task-Agnostic Continual RL: In Praise of a Simple Baseline" ☆30 Updated last year
- Generalized Decision Transformer for Offline Hindsight Information Matching (ICLR2022) ☆64 Updated 2 years ago
- ☆15 Updated 11 months ago
- Related papers for offline reinforcement learning (we mainly focus on representation and sequence modeling and conventional offline RL) ☆17 Updated 2 years ago
- Code for Posterior Sampling for Deep Reinforcement Learning, ICML 2023 ☆23 Updated 6 months ago
- (NeurIPS '22) LISA: Learning Interpretable Skill Abstractions - A framework for unsupervised skill learning using Imitation ☆24 Updated last year
- Official PyTorch Implementation for Metric Residual Networks for Sample Efficient Goal-Conditioned Reinforcement Learning ☆14 Updated last year
- PyTorch code accompanying the paper "Imitating Graph-Based Planning with Goal-Conditioned Policies" (ICLR 2023). ☆19 Updated last year
- The official implementation of the paper "Deep Reinforcement Learning with Task-Adaptive Retrieval via Hypernetwork". ☆11 Updated 6 months ago
- ☆21 Updated 8 months ago
- Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning (NeurIPS 2020) ☆39 Updated 3 years ago