WEIRDLabUW / vpl_llm
Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF"
☆26 · Updated last year
Alternatives and similar repositories for vpl_llm
Users interested in vpl_llm are comparing it to the repositories listed below.
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment" ☆78 · Updated 6 months ago
- Direct preference optimization with f-divergences ☆15 · Updated last year
- A repo for RLHF training and BoN over LLMs, with support for reward model ensembles ☆46 · Updated 11 months ago
- Code for the paper "Reward Uncertainty for Exploration in Preference-based Reinforcement Learning" ☆15 · Updated 3 years ago
- Implementation of the Decision-Pretrained Transformer (DPT) from the paper "Supervised Pretraining Can Learn In-Context Reinforcement Learni…" ☆81 · Updated last year
- Official implementation of Rewarded Soups ☆62 · Updated 2 years ago
- Code for Contrastive Preference Learning (CPL) ☆177 · Updated last year
- Research code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆198 · Updated 8 months ago
- Code for the paper "PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning" (NeurIPS 2023) ☆20 · Updated 2 years ago
- Awesome In-Context RL: a curated list of in-context reinforcement learning resources ☆260 · Updated 3 months ago
- ☆108 · Updated last year
- A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF) ☆192 · Updated 4 months ago
- A comprehensive list of papers, codebases, and datasets on decision making using foundation models, including LLMs and VLMs ☆383 · Updated last year
- Official code repository for Prompt-DT ☆119 · Updated 3 years ago
- ☆51 · Updated 3 years ago
- AdaPlanner: Language Models for Decision Making via Adaptive Planning from Feedback ☆123 · Updated 8 months ago
- Official code for "Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning" ☆53 · Updated last year
- Preference Transformer: Modeling Human Preferences using Transformers for RL (ICLR 2023) ☆166 · Updated 2 years ago
- An index of algorithms for reinforcement learning from human feedback (RLHF) ☆92 · Updated last year
- Reasoning with Language Model is Planning with World Model ☆183 · Updated 2 years ago
- Tracking literature and additional online resources on transformers for sequential decision making, including RL and beyond ☆49 · Updated 3 years ago
- Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning (ICML 2024) ☆18 · Updated last year
- A collection of papers on the continuing line of work that started from World Models ☆191 · Updated last year
- ☆33 · Updated last year
- Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023) ☆42 · Updated last year
- [NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents ☆38 · Updated last year
- A collection of LLM with RL papers ☆278 · Updated last year
- An extensible benchmark for evaluating large language models on planning ☆435 · Updated 3 months ago
- We perform functional grounding of LLMs' knowledge in BabyAI-Text ☆276 · Updated last month