SamsungLabs / tqc

Implementation of the Truncated Quantile Critics method for continuous reinforcement learning.

☆25

Alternatives and similar repositories for tqc

Users who are interested in tqc are comparing it to the repositories listed below.

keynans / HypeRL
Authors' PyTorch implementation of 'Recomposing the Reinforcement Learning Building-Blocks with Hypernetworks' (HypeRL)
☆25 Updated 3 years ago
tedmoskovitz / TOP
Implementation of Tactical Optimistic and Pessimistic value estimation
☆26 Updated last year
rraileanu / idaac
☆53 Updated last year
YYCAAA / V-MPO_Lunarlander
Simple implementation of V-MPO proposed in https://arxiv.org/abs/1909.12238
☆47 Updated 4 years ago
younggyoseo / CaDM
CaDM: Context-aware Dynamics Model for Generalization in Model-based Reinforcement Learning
☆63 Updated 5 years ago
danijar / crafter-baselines
Docker containers of baseline agents for the Crafter environment
☆28 Updated 3 years ago
montrealrobotics / iv_rl
IV-RL - Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation
☆40 Updated 7 months ago
ahmed-touati / controllable_agent
☆45 Updated 2 years ago
evgenii-nikishin / rl_with_resets
JAX implementation of deep RL agents with resets from the paper "The Primacy Bias in Deep Reinforcement Learning"
☆100 Updated 3 years ago
frt03 / generalized_dt
Generalized Decision Transformer for Offline Hindsight Information Matching (ICLR2022)
☆67 Updated 2 years ago
RajGhugare19 / alm
Simplifying Model-based RL: Learning Representations, Latent-space Models and Policies with One Objective
☆80 Updated 2 years ago
sahandrez / homomorphic_policy_gradient
Author's PyTorch Implementation of Deep Homomorphic Policy Gradient (DHPG) - NeurIPS 2022 and JMLR 2024
☆23 Updated last year
eric-mitchell / macaw-min
Clean, extensible implementation of MACAW [ICML 2021]
☆13 Updated 3 years ago
uoe-agents / derl
The official repository of "Decoupled Reinforcement Learning to Stabilise Intrinsically-Motivated Exploration" (AAMAS 2022)
☆27 Updated 3 years ago
Howuhh / sac-n-jax
Single-file SAC-N implementation in JAX with Flax and Equinox. 10x faster than PyTorch
☆52 Updated 2 years ago
proceduralia / high_replay_ratio_continuous_control
Efficient seed-parallel implementation of "Breaking the Replay Ratio Barrier"
☆24 Updated 2 years ago
twni2016 / Memory-RL
When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment, NeurIPS 2023 (oral)
☆62 Updated last year
facebookresearch / level-replay
This code implements Prioritized Level Replay, a method for sampling training levels for reinforcement learning agents that exploits the …
☆86 Updated 3 years ago
nnaisense / MAGE
Learning Action-Value Gradients in Model-based Policy Optimization
☆31 Updated 3 years ago
sparisi / cbet
Change-Based Exploration Transfer
☆36 Updated 3 years ago
yardenas / la-mbda
LAMBDA is a model-based reinforcement learning agent that uses Bayesian world models for safe policy optimization
☆33 Updated 2 years ago
Ji4chenLi / Multi-Task-Batch-RL
☆26 Updated 2 years ago
architsharma97 / earl_benchmark
EARL: Environment for Autonomous Reinforcement Learning
☆37 Updated 2 years ago
RajGhugare19 / stitching-is-combinatorial-generalisation
[ICLR 2024] Closing the Gap between TD Learning and Supervised Learning - A Generalisation Point of View.
☆23 Updated last year
uber-research / D3G
Estimating Q(s,s') with Deep Deterministic Dynamics Gradients
☆32 Updated 5 years ago
manantomar / Mirror-Descent-Policy-Optimization
Mirror Descent Policy Optimization
☆38 Updated 4 years ago
young-geng / JaxCQL
Conservative Q-learning in JAX
☆54 Updated 2 years ago
cassidylaidlaw / effective-horizon
Code and data for the paper "Bridging RL Theory and Practice with the Effective Horizon"
☆48 Updated 11 months ago
clvrai / new-actions-rl
☆24 Updated 9 months ago
micahcarroll / uniMASK
Codebase for "Uni[MASK]: Unified Inference in Sequential Decision Problems"
☆55 Updated 11 months ago