Code for NeurIPS 2024 paper "Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs"
☆46 · Feb 20, 2025 · Updated last year
Alternatives and similar repositories for Generalizable-Reward-Model
Users who are interested in Generalizable-Reward-Model are comparing it to the libraries listed below.
- Score and Distribution Matching Policy: Advanced accelerated Visuomotor Policies via matched distillation ☆10 · May 9, 2025 · Updated 9 months ago
- Official repo for arxiv paper "Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion I… ☆17 · Nov 8, 2024 · Updated last year
- ☆63 · Jan 30, 2026 · Updated last month
- Official code for "Pretraining Representations For Data-Efficient Reinforcement Learning" (NeurIPS 2021) ☆55 · Jul 27, 2021 · Updated 4 years ago
- Minimal Decision Transformer Implementation written in Jax (Flax). ☆17 · Aug 8, 2022 · Updated 3 years ago
- ☆17 · Sep 28, 2023 · Updated 2 years ago
- ☆19 · Oct 2, 2023 · Updated 2 years ago
- [ICML 2025 Oral] Official repo of EmbodiedBench, a comprehensive benchmark designed to evaluate MLLMs as embodied agents. ☆266 · Feb 20, 2026 · Updated last week
- ☆91 · May 31, 2025 · Updated 9 months ago
- Code for NeurIPS 2021 paper "Offline Reinforcement Learning with Reverse Model-based Imagination" ☆19 · Dec 22, 2021 · Updated 4 years ago
- An AI benchmark for Pokémon VGC with agent implementations using multi-agent reinforcement learning, behavior cloning, LLMs, and heuristi… ☆30 · Feb 20, 2026 · Updated last week
- Official code for "Decoding-Time Language Model Alignment with Multiple Objectives". ☆29 · Oct 30, 2024 · Updated last year
- ☆27 · Sep 22, 2025 · Updated 5 months ago
- [NeurIPS 2025] BOOM, A Planning-driven Model-Based RL algorithm ☆58 · Feb 4, 2026 · Updated 3 weeks ago
- Official codebase for GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning. ☆29 · Nov 12, 2024 · Updated last year
- Official code repo for paper: Hybrid RL: Using both offline and online data can make RL efficient. ☆25 · Feb 16, 2023 · Updated 3 years ago
- Code for T-MARS data filtering ☆35 · Aug 23, 2023 · Updated 2 years ago
- ☆120 · Feb 25, 2025 · Updated last year
- Pytorch implementation of "Succinct and Robust Multi-Agent Communication With Temporal Message Control" ☆28 · Dec 6, 2020 · Updated 5 years ago
- Fine-tuning large language models with huggingface transformers and deepspeed ☆31 · Dec 11, 2023 · Updated 2 years ago
- An open source benchmark for Multi Agent Reinforcement Learning ☆31 · Jul 15, 2023 · Updated 2 years ago
- The core repository of the elsciRL framework. ☆18 · Dec 8, 2025 · Updated 2 months ago
- Sparse Backpropagation for Mixture-of-Expert Training ☆29 · Jul 2, 2024 · Updated last year
- A JAX Implementation of the Twin Delayed DDPG Algorithm ☆35 · Mar 12, 2020 · Updated 5 years ago
- ☆31 · Jun 21, 2024 · Updated last year
- ☆33 · Oct 31, 2024 · Updated last year
- Benchmark data (i.e., DeepMind Control Suite and MuJoCo) for RL. ☆33 · Jan 23, 2021 · Updated 5 years ago
- Code for the paper Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance, accepted to CoRL 2023 as an… ☆35 · Jul 15, 2025 · Updated 7 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆40 · Jun 10, 2024 · Updated last year
- PyTorch implementation of our paper Reinforcement Learning with Random Delays (ICLR 2020) ☆42 · May 25, 2022 · Updated 3 years ago
- Official implementation of FRAPPE: Infusing World Modeling into Generalist Policies via Multiple Future Representation Alignment ☆28 · Updated this week
- Repository of IPBench ☆19 · Jan 4, 2026 · Updated last month
- The official repository for paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance" ☆44 · Apr 21, 2024 · Updated last year
- ☆52 · Oct 23, 2023 · Updated 2 years ago
- ☆10 · Nov 17, 2022 · Updated 3 years ago
- Lamorel is a Python library designed for RL practitioners eager to use Large Language Models (LLMs). ☆244 · Dec 11, 2025 · Updated 2 months ago
- A collection of some awesome public projects about LLM-based Web Agents and Tools. ☆12 · Apr 25, 2024 · Updated last year
- [NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms" ☆23 · Oct 23, 2025 · Updated 4 months ago
- ☆11 · Jul 17, 2023 · Updated 2 years ago