PyTorch implementation of BEAR in "Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction"
☆11 · Oct 29, 2019 · Updated 6 years ago
Alternatives and similar repositories for BEAR
Users that are interested in BEAR are comparing it to the libraries listed below.
- ☆15 · Jan 20, 2020 · Updated 6 years ago
- Controllable Multi-Objective Re-ranking with Policy Hypernetworks (KDD 2023) ☆38 · Oct 6, 2024 · Updated last year
- Code to reproduce Supervised Policy Update (ICLR 2019) ☆17 · Dec 8, 2022 · Updated 3 years ago
- Ranking Policy Gradient ☆23 · Nov 27, 2019 · Updated 6 years ago
- [ICLR'20] Learning to Learn by Zeroth-Order Oracle ☆14 · Feb 7, 2020 · Updated 6 years ago
- ☆10 · Aug 18, 2022 · Updated 3 years ago
- Codebase for ICLR 2023 paper, "SMART: Self-supervised Multi-task pretrAining with contRol Transformers" ☆54 · Jan 26, 2024 · Updated 2 years ago
- Code for "Goal-Conditioned Predictive Coding for Offline Reinforcement Learning" (NeurIPS 2023) ☆14 · Dec 8, 2023 · Updated 2 years ago
- This is the official implementation for IJCAI 2023 Paper: Towards Hierarchical Policy Learning for Conversational Recommendation with Hyp… ☆12 · Sep 19, 2023 · Updated 2 years ago
- TensorFlow implementation for our paper "Exploration via Hindsight Goal Generation" ☆23 · Mar 11, 2022 · Updated 4 years ago
- Open-source code for GEAR ☆13 · Dec 3, 2025 · Updated 3 months ago
- A comparison of parameter space noise methods for exploration in deep reinforcement learning ☆30 · Mar 14, 2019 · Updated 7 years ago
- An adjustive SEIR model to estimate parameters of 2019-nCoV ☆19 · Jun 22, 2022 · Updated 3 years ago
- Code for the paper "Minimum-Delay Adaptation in Non-Stationary Reinforcement Learning via Online High-Confidence Change-Point Detection" ☆11 · Aug 7, 2023 · Updated 2 years ago
- Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees ☆55 · Jul 26, 2019 · Updated 6 years ago
- The official implementation of "DVR: Micro-Video Recommendation Optimizing Watch-Time-Gain under Duration Bias" (MM '22) ☆18 · Oct 15, 2022 · Updated 3 years ago
- [SIGIR 2024] NFARec: A Negative Feedback-Aware Recommender Model. ☆12 · Jan 9, 2025 · Updated last year
- Code for SPIBB-DQN and Soft-SPIBB-DQN ☆11 · May 5, 2020 · Updated 5 years ago
- Code for our paper: Hierarchical RL Using an Ensemble of Proprioceptive Periodic Policies ☆15 · Feb 21, 2019 · Updated 7 years ago
- [ICLR 2020, Oral] Harnessing Structures for Value-Based Planning and Reinforcement Learning ☆34 · Feb 1, 2020 · Updated 6 years ago
- Official Implementation for Quality-Similar Diversity via Population Based Reinforcement Learning ☆19 · Dec 26, 2025 · Updated 2 months ago
- Offline RL experiments ☆15 · Oct 1, 2022 · Updated 3 years ago
- Codes accompanying the paper "Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling" (ICLR 2023) https://arxiv.or… ☆41 · Oct 11, 2023 · Updated 2 years ago
- ☆10 · Aug 8, 2017 · Updated 8 years ago
- Code for Expert Supervised Reinforcement Learning ☆10 · Apr 7, 2021 · Updated 4 years ago
- Example of android app written in Qt/Qml which uses MXNet for plant image recognition. ☆10 · Nov 4, 2017 · Updated 8 years ago
- Python implementation of TPGR ☆40 · Mar 27, 2019 · Updated 6 years ago
- ☆15 · May 24, 2021 · Updated 4 years ago
- Code for Stabilizing Off-Policy RL via Bootstrapping Error Reduction ☆163 · Jul 17, 2020 · Updated 5 years ago
- IPython Notebooks on various things ☆14 · Dec 4, 2017 · Updated 8 years ago
- Anti exploration in offline reinforcement learning ☆11 · May 17, 2021 · Updated 4 years ago
- ☆11 · Oct 19, 2018 · Updated 7 years ago
- An implementation of the paper "Solving the Rubik's Cube without Human Knowledge" ☆14 · Dec 9, 2018 · Updated 7 years ago
- Ant Gather and Ant Maze envs, separated from RLLab ☆11 · Aug 2, 2018 · Updated 7 years ago
- A Qwen .5B reasoning model trained on OpenR1-Math-220k ☆14 · Oct 11, 2025 · Updated 5 months ago
- Official Codebase for "Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control" (NeurIPS 2024) ☆15 · Oct 29, 2024 · Updated last year
- Computes the per-person ratio of likes given to likes received on WeChat Moments ☆11 · May 3, 2016 · Updated 9 years ago
- ☆11 · Aug 10, 2020 · Updated 5 years ago
- An implementation of effective policy ensemble. ☆16 · Jul 5, 2023 · Updated 2 years ago