stephen-chung-mh / thinker
Thinker project
☆11 Updated 2 months ago
Related projects
Alternatives and complementary repositories for thinker
- ☆34 Updated last year
- Docker containers of baseline agents for the Crafter environment ☆28 Updated 2 years ago
- [ICLR 2024] Closing the Gap between TD Learning and Supervised Learning - A Generalisation Point of View. ☆21 Updated 7 months ago
- Code for magnetic mirror descent. ☆15 Updated last year
- Jaxplorer is a Jax reinforcement learning (RL) framework for exploring new ideas. ☆12 Updated 4 months ago
- General Modules for JAX ☆59 Updated 3 months ago
- Learning diverse options through the Laplacian representation. ☆22 Updated 10 months ago
- ☆15 Updated 7 months ago
- CREATE Environment for long-horizon physics-puzzle tasks with diverse tools ☆17 Updated 2 years ago
- Official codebase for Generating Diverse Cooperative Agents by Learning Incompatible Policies (notable-top-25% @ ICLR 2023) ☆14 Updated 6 months ago
- ☆38 Updated last year
- ☆29 Updated 8 months ago
- Simple JAX Graphics Library. ☆23 Updated 2 weeks ago
- ☆23 Updated 2 years ago
- Highly scalable 2D JAX physics engine. ☆35 Updated last week
- Code accompanying the paper "TiZero: Mastering Multi-Agent Football with Curriculum Learning and Self-Play" (AAMAS 2023); football game agent ☆13 Updated last year
- Code and data for the paper "Bridging RL Theory and Practice with the Effective Horizon" ☆42 Updated 4 months ago
- EARL: Environment for Autonomous Reinforcement Learning ☆34 Updated last year
- Scalable Opponent Shaping Experiments in JAX ☆21 Updated 7 months ago
- On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning ☆17 Updated last year
- Code for Model-Free Opponent Shaping (ICML 2022) ☆17 Updated 2 years ago
- Corax: Core RL in JAX ☆35 Updated 9 months ago
- Jax-Baseline is a Reinforcement Learning implementation using JAX and Flax/Haiku libraries, mirroring the functionality of Stable-Baselin… ☆40 Updated last week
- An Open-Ended Agentic Simulator ☆28 Updated 3 months ago
- ☆63 Updated 3 months ago
- Mitigating Partial Observability in Sequential Decision Processes via the Lambda Discrepancy ☆15 Updated 3 weeks ago
- ☆29 Updated 3 years ago
- Accompanying Code for "Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning", ICML 2023 ☆18 Updated 10 months ago
- Efficient seed-parallel implementation of "Breaking the Replay Ratio Barrier" ☆21 Updated last year
- ☆28 Updated 3 years ago