openpsi-project / srl
A Really Scalable RL Framework to 10k+ CPUs
☆15Updated 6 months ago
Related projects:
- SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores☆13Updated 4 months ago
- Code accompanying the paper "TiZero: Mastering Multi-Agent Football with Curriculum Learning and Self-Play" (AAMAS 2023). Football game agent.☆47Updated last year
- ☆13Updated this week
- Launch programs on multiple hosts (multi-host program launcher).☆14Updated last year
- PyTorch implementations for Offline Preference-Based RL (PbRL) algorithms☆15Updated last week
- ☆86Updated 2 years ago
- RLA is a tool for managing your RL experiments automatically☆70Updated last year
- Codebase for "Uni[MASK]: Unified Inference in Sequential Decision Problems"☆54Updated 2 months ago
- Benchmarked implementations of Offline RL Algorithms.☆62Updated 4 months ago
- Extreme Q-Learning: Max Entropy RL without Entropy☆78Updated last year
- ☆11Updated 4 months ago
- CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL☆102Updated 3 weeks ago
- A curated list of awesome memory in reinforcement learning research materials☆18Updated 3 years ago
- Source code for the paper "Divergence-Augmented Policy Optimization"☆37Updated 4 years ago
- [ICLR 2023 Oral] The official implementation of SQL and EQL in "Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Reg…☆41Updated last year
- Uni-RLHF platform for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR2024…☆30Updated 6 months ago
- Implementation of the Off Belief Learning algorithm.☆44Updated 2 years ago
- Official implementation of the paper "Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning"☆68Updated last month
- The Official Code for Offline Model-based Adaptable Policy Learning (NeurIPS'21 & TPAMI)☆22Updated 8 months ago
- [NeurIPS 2022 Oral] The official implementation of POR in "A Policy-Guided Imitation Approach for Offline Reinforcement Learning"☆54Updated last year
- This is the source code of RPG (Reward-Randomized Policy Gradient)☆43Updated 2 years ago
- Official codebase for Exact Energy-Guided Diffusion Sampling via Contrastive Energy Prediction (ICML 2023)☆37Updated last year
- Python interface for accessing the near real-world offline reinforcement learning (NeoRL) benchmark datasets☆103Updated last year
- A simple and scalable agent for training adaptive policies with sequence-based RL☆79Updated this week
- Implementation of Multi-Game Decision Transformers in PyTorch☆42Updated last year
- Learning to Modulate pre-trained Models in RL (Decision Transformer, LoRA, Fine-tuning)☆49Updated 8 months ago
- Official code repository for Prompt-DT.☆93Updated 2 years ago
- Code for "Unsupervised Zero-Shot RL via Functional Reward Representations"☆51Updated 5 months ago
- ☆20Updated 11 months ago
- ☆21Updated 8 months ago