microsoft / EPPO
An implementation of effective policy ensemble.
☆10 · Updated last year
Alternatives and similar repositories for EPPO:
Users who are interested in EPPO are comparing it to the libraries listed below.
- ☆32 · Updated 5 months ago
- More efficient exploration for reinforcement learning in two-player, zero-sum games ☆19 · Updated 5 months ago
- ☆36 · Updated 3 years ago
- Author's PyTorch implementation of SR-DICE for marginalized importance sampling ☆15 · Updated 3 years ago
- Source code for the Self-Paced Deep Reinforcement Learning Experiments ☆30 · Updated last year
- ☆12 · Updated 2 years ago
- Authors' PyTorch implementation of 'Recomposing the Reinforcement Learning Building-Blocks with Hypernetworks' (HypeRL) ☆25 · Updated 3 years ago
- Official implementation for the paper "Offline Meta RL - Identifiability Challenges and Effective Data Collection Strategies", NeurIPS 20… ☆30 · Updated 3 years ago
- An unofficial implementation for online decision transformer ☆39 · Updated 2 years ago
- 🔍 Codebase for the ICML '20 paper "Ready Policy One: World Building Through Active Learning" (arxiv: 2002.02693) ☆18 · Updated last year
- IV-RL - Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation ☆38 · Updated 2 months ago
- ☆29 · Updated 2 years ago
- ☆41 · Updated 3 years ago
- The collection of the research works about Automatic Reinforcement Learning in Microsoft Research Asia. ☆49 · Updated last year
- Implementation of Wasserstein Natural Policy Gradients and Wasserstein Natural Evolution Strategies ☆10 · Updated 3 years ago
- Open source demo for the paper Learning to Score Behaviors for Guided Policy Optimization ☆24 · Updated 4 years ago
- ☆14 · Updated 2 years ago
- Model-Based Reinforcement Learning via Latent-Space Collocation. ☆32 · Updated last year
- Code to accompany the paper "The Information Geometry of Unsupervised Reinforcement Learning" ☆20 · Updated 3 years ago
- Official implementation of "Know Your Action Set: Learning Action Relations for Reinforcement Learning", Jain et al., ICLR 2022. ☆17 · Updated 2 years ago
- ☆8 · Updated 2 years ago
- ☆29 · Updated 2 years ago
- Code for NeurIPS 2021 paper "Curriculum Offline Imitation Learning" ☆17 · Updated 2 years ago
- Github repo for HIDIO: Hierarchical Reinforcement Learning by Discovering Intrinsic Options ☆44 · Updated 3 years ago
- Generalized Decision Transformer for Offline Hindsight Information Matching (ICLR2022) ☆66 · Updated 2 years ago
- Simple implementation of V-MPO proposed in https://arxiv.org/abs/1909.12238 ☆44 · Updated 4 years ago
- Author's PyTorch Implementation of Deep Homomorphic Policy Gradient (DHPG) - NeurIPS 2022 and JMLR 2024 ☆22 · Updated 9 months ago
- ☆54 · Updated 10 months ago
- Implementation of the skill discovery algorithm described in ICLR submission "Option Discovery using Deep Skill Chaining" ☆28 · Updated 5 years ago
- Implementation of VALOR (Variational Option Discovery Algorithms) ☆10 · Updated 5 years ago