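This course introduces the fundamentals of reinforcement learning, with the goal of helping students move quickly and smoothly into research on reinforcement learning and its applications. The course covers finite Markov decision processes, dynamic programming, model-free prediction and control (SARSA, Q-Learning), value function approximation (DQN), policy gradient methods (REINFORCE), actor-critic methods (AC, TRPO, PPO), and deterministic policies for continuous action spaces (DDPG).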
☆17Oct 17, 2022Updated 3 years ago
Alternatives and similar repositories for A05_rl
Users that are interested in A05_rl are comparing it to the libraries listed below.
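- Online library of Chinese laws and regulations (more than two thousand documents); the original files come from the National Database of Laws and Regulations https://flk.npc.gov.cn/index ☆43Updated this week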
- QuickSplat: Fast 3D Surface Reconstruction via Learned Gaussian Initialization☆21Nov 11, 2025Updated 5 months ago
- Train timetable optimization based on passenger-flow demand, using a genetic algorithm☆15Apr 25, 2021Updated 5 years ago
- ☆10Jun 13, 2023Updated 2 years ago
- ☆10Jul 13, 2019Updated 6 years ago
- Draws train operation diagrams and gives the optimal train schedule☆15Mar 9, 2020Updated 6 years ago
- Transport video using ROS image_transport to eliminate latency.☆20May 11, 2023Updated 2 years ago
- ☆47Mar 29, 2026Updated last month
- ☆45Oct 29, 2025Updated 6 months ago
- Some notes about reinforcement learning, self-driving cars and LeetCode☆21Mar 26, 2022Updated 4 years ago
- Implementation of the TD3 algorithm written in Pytorch☆12Dec 8, 2022Updated 3 years ago
- Markdown syntax documentation, organized and revised☆13Jun 25, 2019Updated 6 years ago
- ☆11Apr 16, 2023Updated 3 years ago
- A Benchmark and Evaluation Suite for Zero-shot Singing Voice Synthesis☆26Feb 11, 2026Updated 2 months ago
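- yolo26-plate: license plate detection and recognition, including Chinese license plates; supports 12 types of Chinese license plates and double-row plates☆56Feb 13, 2026Updated 2 months ago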
- Hand-eye calibration software based on MoveIt2☆46Aug 16, 2025Updated 8 months ago
- Research on automatic fault identification for centrifugal pump rolling bearings based on transfer learning☆20May 29, 2020Updated 5 years ago
- ☆22Jan 8, 2020Updated 6 years ago
- poorman's ar-dit tts☆45Dec 31, 2025Updated 4 months ago
- This is Gao Hua's (高华) part. Integrated application system for train operation diagrams☆20Dec 8, 2022Updated 3 years ago
- A simple C++ Multi-file VSCode project template based on Makefile.☆16Oct 26, 2021Updated 4 years ago
- Reinforcement-learning-based solution techniques and software implementation for dynamic steelmaking scheduling☆25Apr 26, 2020Updated 6 years ago
- Standardized compatibility layer for operating systems and peripheral devices written in C++.☆42Updated this week
- A Keras-based and TensorFlow-backend NLP Models Toolkit.☆12Jul 7, 2022Updated 3 years ago
- Reproduce the paper Distributed Representations of Sentences and Documents in tensorflow☆14Apr 8, 2017Updated 9 years ago
- Attentional Neural Network that translates text to phones.☆11Jan 25, 2018Updated 8 years ago
- Domain Adaptive Neural Networks with DJP-MMD☆20Sep 22, 2021Updated 4 years ago
- A seq2seq with attention dialogue/MT model implemented by TensorFlow.☆11Jul 17, 2018Updated 7 years ago
- A Reinforcement Learning Friendly Simulator for Mobile Robot☆28Apr 27, 2025Updated last year
- A simple PyTorch implementation of vector representation methods for Chinese text (Sentence-BERT, CoSENT), usable for text similarity computation.☆10Mar 27, 2022Updated 4 years ago
- 5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs☆57Nov 19, 2025Updated 5 months ago
- Markov Chain Monte Carlo (MCMC) methods implemented in various languages (including R, Python, Julia, Matlab)☆29Jun 20, 2023Updated 2 years ago
- ☆10May 6, 2020Updated 6 years ago
- ACL Paper Lists(machine translation)☆13Mar 23, 2022Updated 4 years ago
- Practical training files from 电巢☆27Mar 28, 2022Updated 4 years ago
- ☆24Jul 30, 2022Updated 3 years ago
- Vietnamese Punctuation Prediction using Pretrained Language Models☆14May 8, 2022Updated 3 years ago
- ☆23Mar 26, 2025Updated last year
- Implements a lightweight workflow for Codex inspired by Recursive Language Models (MIT). Now known as 'recursive-mode'☆56Apr 10, 2026Updated 3 weeks ago