KMnO4-zx / hand-on-rl
☆19Updated 9 months ago
Alternatives and similar repositories for hand-on-rl
Users that are interested in hand-on-rl are comparing it to the libraries listed below
- ☆102Updated 11 months ago
- ☆96Updated 2 months ago
- ☆248Updated 3 months ago
- llm & rl☆198Updated last week
- ☆51Updated last year
- Build a bridge that connects beginners to deep reinforcement learning.☆11Updated 11 months ago
- ☆369Updated 6 months ago
- MinRL provides clean, minimal implementations of fundamental reinforcement learning algorithms in a customizable GridWorld environment. T…☆94Updated 3 months ago
- 📖 Full Stack Practice of the Large Language Model Training @ RLChina 2024☆40Updated 10 months ago
- ☆98Updated last year
- ☆19Updated last year
- ☆631Updated 2 years ago
- This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems☆87Updated 5 months ago
- Full stack LLM (Pre-training/finetuning, PPO(RLHF), Inference, Quant, etc.)☆27Updated 6 months ago
- A reconstruction of RL Introduction and its course materials for a more efficient entry☆14Updated 3 months ago
- bilibili video course src code☆370Updated last year
- ☆214Updated 6 months ago
- An easier PyTorch deep reinforcement learning library.☆235Updated 8 months ago
- LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning☆82Updated 11 months ago
- ☆54Updated last year
- This is the official implementation of paper "Leveraging Dual Process Theory in Language Agent Framework for Simultaneous Human-AI Collab…☆40Updated 3 months ago
- Monitoring recent cross-research on LLM & RL on arXiv for control. If there are good papers, PRs are welcome.☆475Updated 11 months ago
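- This repository hosts the code for AUTOPLAN, published in Acta Automatica Sinica; it uses large language models for task planning and task execution on complex tasks☆105Updated 9 months ago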
- Run TRex with PPO☆39Updated 3 months ago
- Train your grpo with zero dataset and low resources, 8bit/4bit/lora/qlora supported, multi-gpu supported ...☆75Updated 4 months ago
- This project builds a palm-reading fortune-telling system based on multimodal, RAG, and LLM techniques☆28Updated last year
- Not interactive deep reinforcement learning book with no-framework code, copied math, no discussions. Adopted at only -1 university(Shanh…☆24Updated last year
- personal chatgpt☆382Updated 8 months ago
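- Hand-written interview coding questions for AI algorithms, mainly LLMs plus search, advertising, and recommendation (not LeetCode), e.g. Self-Attention and AUC; these generally test overall ability more than LeetCode does and sit closer to business needs and fundamentals☆349Updated 8 months ago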
- A curated list of RL resources☆44Updated 3 weeks ago