ZhaolinGao / REBEL
☆28 Updated last month
Alternatives and similar repositories for REBEL:
Users who are interested in REBEL are comparing it to the libraries listed below:
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients ☆26 Updated 4 months ago
- ☆75 Updated 6 months ago
- Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF" ☆28 Updated last year
- ☆25 Updated 8 months ago
- ☆15 Updated 11 months ago
- Official PyTorch Implementation of the Longhorn Deep State Space Model ☆45 Updated last month
- Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023) ☆41 Updated 5 months ago
- ☆43 Updated 2 weeks ago
- PyTorch Package For Quasimetric Learning ☆41 Updated 2 months ago
- Learning to Modulate pre-trained Models in RL (Decision Transformer, LoRA, Fine-tuning) ☆52 Updated 3 months ago
- JAX implementation of VQVAE/VQGAN autoencoders (+FSQ) ☆24 Updated 7 months ago
- Code for "Unsupervised Zero-Shot RL via Functional Reward Representations" ☆54 Updated 9 months ago
- Learn online intrinsic rewards from LLM feedback ☆33 Updated last month
- Dataset Reset Policy Optimization ☆28 Updated 9 months ago
- Codebase for "Uni[MASK]: Unified Inference in Sequential Decision Problems" ☆54 Updated 6 months ago
- Official code implementation for the work "Preference Alignment with Flow Matching" (NeurIPS 2024) ☆20 Updated 2 months ago
- Minimal but scalable implementation of large language models in JAX ☆28 Updated 2 months ago
- Official codebase for "The Generalization Gap in Offline Reinforcement Learning" accepted to ICLR 2024 ☆29 Updated 5 months ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆25 Updated 9 months ago
- ICML 2022: Learning Iterative Reasoning through Energy Minimization ☆44 Updated last year
- VC-FB and MC-FB algorithms from "Zero-Shot Reinforcement Learning from Low Quality Data" (NeurIPS 2024) ☆12 Updated this week
- Code for the paper "Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference" ☆39 Updated 6 months ago
- PyTorch implementation of MuZero Unplugged for gym environments. This algorithm is capable of supporting a wide range of action and observ… ☆27 Updated last year
- Implementation of Direct Preference Optimization ☆15 Updated last year
- ☆26 Updated 2 months ago
- Rewarded soups official implementation ☆54 Updated last year
- We develop world models that can be adapted with natural language. Integrating these models into artificial agents allows humans to effe… ☆20 Updated 11 months ago
- ☆26 Updated 2 months ago
- [ICML 2024] Official code release accompanying the paper "diff History for Neural Language Agents" (Piterbarg, Pinto, Fergus) ☆19 Updated 4 months ago
- Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024) ☆27 Updated 6 months ago