robintyh1/icml2021-pengqlambda

Revisiting Peng's Q(lambda) for Modern Reinforcement Learning

☆15

Alternatives and similar repositories for icml2021-pengqlambda

Users who are interested in icml2021-pengqlambda are comparing it to the libraries listed below.

philipjball / OffCon3
📴 OffCon^3: SOTA PyTorch SAC and TD3 Implementations (arxiv: 2101.11331)
☆25 · Jun 20, 2021 · Updated 5 years ago
abbyvansoest / maxent
☆14 · May 30, 2019 · Updated 7 years ago
tgangwani / GuidanceRewards
PyTorch code for "Learning Guidance Rewards with Trajectory-space Smoothing" (NeurIPS 2020)
☆12 · Jul 7, 2021 · Updated 5 years ago
frt03 / jax_dt
Minimal Decision Transformer implementation written in JAX (Flax).
☆18 · Aug 8, 2022 · Updated 3 years ago
frt03 / mxt_bench
A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation (ICLR 2023)
☆14 · Feb 3, 2023 · Updated 3 years ago
rlseminar / rlseminar.github.io
Reinforcement Learning Seminar at the Chinese University of Hong Kong, Shenzhen, China.
☆21 · Nov 17, 2023 · Updated 2 years ago
juliuskunze / cwvae-jax
Clockwork VAEs in JAX/Flax
☆32 · Jul 16, 2021 · Updated 5 years ago
mklissa / PPOC
Proximal Policy Option-Critic
☆26 · Jan 4, 2019 · Updated 7 years ago
mklissa / phi_gcn
Reward Propagation using Graph Convolutional Networks
☆13 · Jun 19, 2021 · Updated 5 years ago
hari-sikchi / LOOP
Learning Off-Policy with Online Planning [CoRL 2021 Best Paper Finalist]
☆42 · Aug 27, 2022 · Updated 3 years ago
russellmendonca / mier_public
☆13 · Mar 16, 2023 · Updated 3 years ago
montrealrobotics / iv_rl
IV-RL - Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation
☆40 · Jul 18, 2025 · Updated last year
apple / ml-uwac
☆35 · Jul 10, 2021 · Updated 5 years ago
instadeepai / fastpbrl
Vectorization techniques for fast population-based training.
☆57 · Apr 26, 2026 · Updated 2 months ago
NagisaZj / MetaCURE-Public
☆15 · Apr 5, 2023 · Updated 3 years ago
astanic / crafter-ood
☆19 · Nov 25, 2022 · Updated 3 years ago
microsoft / segar
Sandbox environment for generalizable agent research
☆27 · Aug 19, 2022 · Updated 3 years ago
bstadie / krazyworld
krazy grid world
☆26 · Mar 2, 2020 · Updated 6 years ago
kvfrans / rlbase_stable
☆46 · Jul 12, 2024 · Updated 2 years ago
ikostrikov / dmcgym
☆23 · Aug 19, 2022 · Updated 3 years ago
illidanlab / rpg
Ranking Policy Gradient
☆23 · Nov 27, 2019 · Updated 6 years ago
google-deepmind / affordances_option_models
☆22 · Nov 8, 2021 · Updated 4 years ago
Farama-Foundation / minari-dataset-generation-scripts
Scripts to recreate the D4RL datasets with Minari
☆26 · Jul 4, 2026 · Updated 2 weeks ago
DavidJanz / successor_uncertainties_atari
Code for paper "Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning" by David Janz*, Jiri Hron*, Przemys…
☆21 · Feb 24, 2023 · Updated 3 years ago
jurgisp / memory-maze
Evaluating long-term memory of reinforcement learning algorithms
☆180 · Jun 23, 2023 · Updated 3 years ago
htdt / lwm
Latent World Models For Intrinsically Motivated Exploration | Official repository
☆23 · Apr 28, 2021 · Updated 5 years ago
proceduralia / high_replay_ratio_continuous_control
Efficient seed-parallel implementation of "Breaking the Replay Ratio Barrier"
☆28 · May 22, 2023 · Updated 3 years ago
google-deepmind / zipfian_environments
☆28 · Jul 28, 2022 · Updated 3 years ago
Victorwz / LaViA
☆10 · Jul 13, 2024 · Updated 2 years ago
keynans / HypeRL
Authors' PyTorch implementation of 'Recomposing the Reinforcement Learning Building-Blocks with Hypernetworks' (HypeRL)
☆26 · Jun 9, 2021 · Updated 5 years ago
danijar / embodied
Fast reinforcement learning research
☆65 · May 25, 2026 · Updated last month
geyang / e-maml
E-MAML and RL-MAML baseline implemented in TensorFlow v1
☆17 · Dec 7, 2019 · Updated 6 years ago
floringogianu / atari-agents
Code and links for over 25,000 trained Atari agents
☆100 · Aug 22, 2024 · Updated last year
Santara / stochastic_value_gradient
Implementation of "Learning Continuous Control Policies by Stochastic Value Gradients" (https://arxiv.org/abs/1510.09142)
☆25 · Jan 15, 2022 · Updated 4 years ago
liziniu / policy_optimization
Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)
☆29 · Dec 19, 2023 · Updated 2 years ago
schmidtdominik / Rainbow
Rainbow DQN implementation accompanying the paper "Fast and Data-Efficient Training of Rainbow" which reaches 205.7 median HNS after 10M …
☆44 · Dec 11, 2021 · Updated 4 years ago
nnaisense / MAGE
Learning Action-Value Gradients in Model-based Policy Optimization
☆32 · Sep 7, 2021 · Updated 4 years ago
qlan3 / Explorer
Explorer is a PyTorch reinforcement learning framework for exploring new ideas.
☆98 · Updated this week
epignatelli / discovering-reinforcement-learning-algorithms
A Jax/Stax implementation of the general meta learning paper: Oh, J., Hessel, M., Czarnecki, W.M., Xu, Z., van Hasselt, H.P., Singh, S. a…
☆23 · Dec 22, 2020 · Updated 5 years ago