ymetz / rlhfblender
RLHF-Blender: A Configurable Interactive Interface for Learning from Diverse Human Feedback
☆12 Updated last week
Alternatives and similar repositories for rlhfblender:
Users who are interested in rlhfblender compare it with the repositories listed below:
- Implementation of CASCADE in Learning General World Models in a Handful of Reward-Free Deployments (NeurIPS 22). ☆29 Updated 2 years ago
- Repo to reproduce the First-Explore paper results ☆37 Updated 3 months ago
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆23 Updated this week
- We develop world models that can be adapted with natural language. Integrating these models into artificial agents allows humans to effe… ☆22 Updated last year
- Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories ☆42 Updated last year
- OMNI: Open-endedness via Models of human Notions of Interestingness ☆43 Updated 2 months ago
- Minimal code for A Generalist Agent ☆39 Updated 2 years ago
- LLM Dynamic Planner - Combining LLM with PDDL Planners to solve an embodied task ☆42 Updated 3 months ago
- ☆15 Updated 2 years ago
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models ☆54 Updated last month
- ☆18 Updated 6 months ago
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients ☆26 Updated 7 months ago
- Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024) ☆29 Updated 8 months ago
- Repository for "Quality-Diversity Actor-Critic: Learning High-Performing and Diverse Behaviors via Value and Successor Features Critics" … ☆16 Updated 9 months ago
- Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF" ☆16 Updated 7 months ago
- ☆14 Updated last year
- INTeractive learning via REPresentatIon Discovery ☆34 Updated 10 months ago
- OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (ICLR 2025). ☆46 Updated 3 months ago
- ☆53 Updated 5 months ago
- Official code for "Reward-Free Curricula for Training Robust World Models", ICLR 2024. ☆27 Updated last year
- Causal Analysis of Agent Behavior for AI Safety ☆17 Updated last year
- Learning to Identify Critical States for Reinforcement Learning from Videos (Accepted to ICCV'23) ☆26 Updated last year
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers ☆17 Updated last month
- ☆15 Updated last year
- ☆17 Updated 4 years ago
- A testbed for agents and environments that can automatically improve models through data generation. ☆23 Updated last month
- Generalised UDRL ☆37 Updated 2 years ago
- Official code for the paper "Context-Aware Language Modeling for Goal-Oriented Dialogue Systems" ☆34 Updated 2 years ago
- Official implementation of FIND (NeurIPS '23) Function Interpretation Benchmark and Automated Interpretability Agents ☆49 Updated 6 months ago
- Official implementation of Zero-Hero paper ☆22 Updated last month