openpsi-project / srl
A Really Scalable RL Framework to 10k+ CPUs
☆33Updated last year
Alternatives and similar repositories for srl
Users who are interested in srl are comparing it to the libraries listed below.
- A distributed GPU-centric experience replay system for large AI models.☆18Updated last year
- SRL: Scaling Distributed Reinforcement Learning to Over Ten Thousand Cores☆15Updated last year
- [NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank☆48Updated 7 months ago
- A high-performance, scalable MindSpore reinforcement learning framework.☆50Updated 11 months ago
- CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL☆113Updated 10 months ago
- Launch programs on multiple hosts.☆14Updated last year
- Uni-RLHF platform for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR2024…☆36Updated 7 months ago
- ☆30Updated last year
- ☆30Updated 2 years ago
- Minimal RLHF implementation built on top of minGPT.☆29Updated 11 months ago
- Code accompanying the paper "TiZero: Mastering Multi-Agent Football with Curriculum Learning and Self-Play" (AAMAS 2023). Football game agent.☆58Updated last year
- ☆89Updated 2 years ago
- RL-Scope: Cross-Stack Profiling for Deep Reinforcement Learning Workloads☆44Updated 4 years ago
- Extreme Q-Learning: Max Entropy RL without Entropy☆88Updated 2 years ago
- ☆43Updated this week
- Distributed DRL by Ray and TensorFlow Tutorial.☆10Updated 5 years ago
- MR.Q is a general-purpose model-free reinforcement learning algorithm.☆104Updated 2 months ago
- Learning to Modulate pre-trained Models in RL (Decision Transformer, LoRA, Fine-tuning)☆58Updated 8 months ago
- Baselines for Neural MMO -- new users should treat this repo as a starter project☆47Updated 10 months ago
- Allow torch tensor memory to be released and resumed later☆40Updated last week
- ☆40Updated last year
- Odysseus: Playground of LLM Sequence Parallelism☆70Updated last year
- Codebase for "Uni[MASK]: Unified Inference in Sequential Decision Problems"☆55Updated 11 months ago
- Learn online intrinsic rewards from LLM feedback☆41Updated 6 months ago
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation☆304Updated 2 months ago
- Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models☆57Updated 2 months ago
- RLA is a tool for managing your RL experiments automatically☆71Updated 2 years ago
- ☆49Updated last month
- Re-implementations of SOTA RL algorithms.☆133Updated last year
- Official code repository for Prompt-DT.☆112Updated 2 years ago