IBM / constrained-rl
Constrained Exploration and Recovery from Experience Shaping
☆21 · Updated 5 years ago
Related projects
Alternatives and complementary repositories for constrained-rl
- Code accompanying the paper "Action Robust Reinforcement Learning and Applications in Continuous Control" https://arxiv.org/abs/1901.0918… ☆41 · Updated 5 years ago
- We investigate the effect of populations on finding good solutions to the robust MDP ☆28 · Updated 3 years ago
- Learning Off-Policy with Online Planning [CoRL 2021 Best Paper Finalist] ☆34 · Updated 2 years ago
- Safe Model-based Reinforcement Learning with Robust Cross-Entropy Method ☆62 · Updated last year
- Toolkit of Causal Model-based Reinforcement Learning. ☆32 · Updated last year
- Open source demo for the paper Learning to Score Behaviors for Guided Policy Optimization ☆24 · Updated 4 years ago
- Code for Latent Action Space for Offline Reinforcement Learning [CoRL 2020] ☆48 · Updated 3 years ago
- On-policy optimization baselines for deep reinforcement learning ☆28 · Updated 4 years ago
- IV-RL - Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation ☆36 · Updated 3 weeks ago
- A curated list of awesome Model-based reinforcement learning resources ☆90 · Updated 4 years ago
- This repo is the implementation of paper "SHAQ: Incorporating Shapley Value Theory into Multi-Agent Q-Learning". ☆41 · Updated 11 months ago
- TensorFlow implementation of "A Relational Intervention Approach for Unsupervised Dynamics Generalization in Model-Based Reinforcement Le… ☆15 · Updated 2 years ago
- Implementation of the Model-Based Meta-Policy-Optimization (MB-MPO) algorithm ☆44 · Updated 6 years ago
- DecentralizedLearning ☆21 · Updated last year
- Code for the paper "AlwaysSafe: Reinforcement Learning Without Safety Constraint Violations During Training" ☆18 · Updated 2 years ago
- PyTorch implementation of our paper Reinforcement Learning with Random Delays (ICLR 2020) ☆39 · Updated 2 years ago
- PyTorch IMPALA implementation ☆24 · Updated 5 years ago
- Safe Policy Improvement with Baseline Bootstrapping ☆25 · Updated 4 years ago
- (Experimental) Inverse reinforcement learning from trajectories generated by multiple agents with different (but correlated) rewards ☆26 · Updated 5 years ago
- Code for the NeurIPS 2021 paper "Safe Reinforcement Learning by Imagining the Near Future" ☆39 · Updated 2 years ago
- Code accompanying the paper "DOP: Off-Policy Multi-Agent Decomposed Policy Gradients" (ICLR 2021, https://arxiv.org/abs/2007.12322) ☆52 · Updated last year
- Implementations of SAILR, PDO, and CSC ☆31 · Updated 4 months ago
- IMP-MARL: a Suite of Environments for Large-scale Infrastructure Management Planning via MARL ☆35 · Updated 2 months ago
- Offline Risk-Averse Actor-Critic (O-RAAC). A model-free RL algorithm for risk-averse RL in a fully offline setting ☆33 · Updated 3 years ago
- Implementation of Population-Guided Parallel Policy Search for Reinforcement Learning ☆22 · Updated 4 years ago