ravi-lanka-4 / CoPiEr
Co-training for Policy Learning
☆13 · Updated 5 years ago
Related projects
Alternatives and complementary repositories for CoPiEr
- ☆17 · Updated 3 years ago
- Source code for "Multi-objective Model-based Policy Search for Data-efficient Learning with Sparse Rewards" (CoRL 2018) ☆13 · Updated 6 years ago
- This is the source code for solving the Traveling Salesman Problems (TSP) using Monte Carlo tree search (MCTS). ☆29 · Updated 5 years ago
- PIC: Permutation Invariant Critic for Multi-Agent Deep Reinforcement Learning ☆49 · Updated 3 years ago
- (ICLR 2021) Learning to Represent Action Values as a Hypergraph on the Action Vertices ☆21 · Updated 3 years ago
- ☆16 · Updated 6 years ago
- Open source demo for the paper Learning to Score Behaviors for Guided Policy Optimization ☆24 · Updated 4 years ago
- Self-implemented code for Model-Based Meta-Reinforcement Learning ☆17 · Updated 5 years ago
- The implementation of Discriminator Soft Actor Critic ☆14 · Updated 4 years ago
- 🧶 Minimal PyTorch Soft Actor Critic (SAC) implementation ☆36 · Updated 2 years ago
- Code for Multi-Agent Common Knowledge Reinforcement Learning (NeurIPS 2019) ☆33 · Updated 4 years ago
- Code for "Calibrated Model-Based Deep Reinforcement Learning", ICML 2019. ☆55 · Updated 5 years ago
- A comparison of parameter space noise methods for exploration in deep reinforcement learning ☆27 · Updated 5 years ago
- Implementation for ICML 2019 paper, EMI: Exploration with Mutual Information. ☆36 · Updated 3 years ago
- ICRL 2020 ☆18 · Updated 4 years ago
- ☆18 · Updated 3 years ago
- IV-RL - Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation ☆36 · Updated 3 weeks ago
- [ICLR 2020, Oral] Harnessing Structures for Value-Based Planning and Reinforcement Learning ☆33 · Updated 4 years ago
- 🔍 Codebase for the ICML '20 paper "Ready Policy One: World Building Through Active Learning" (arxiv: 2002.02693) ☆18 · Updated last year
- Implementation of Population-Guided Parallel Policy Search for Reinforcement Learning ☆22 · Updated 4 years ago
- Experiment for Understanding the Effects of Dataset Characteristics on Offline Reinforcement Learning ☆24 · Updated last year
- (Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards ☆26 · Updated 5 years ago
- Author's PyTorch Implementation of Deep Homomorphic Policy Gradient (DHPG) - NeurIPS 2022 and JMLR 2024 ☆22 · Updated 7 months ago
- Variational Reinforcement Learning ☆16 · Updated 3 months ago
- Code accompanying the paper "Information Directed Reward Learning for Reinforcement Learning" (NeurIPS 2021). ☆13 · Updated 3 years ago
- Code for paper "Model-based Adversarial Meta-Reinforcement Learning" (https://arxiv.org/abs/2006.08875)