A reimplementation of the classic tabular Q-learning algorithm. It supports most gym environments with discrete action and state spaces, such as CliffWalking-v0.
☆10 · Jan 2, 2021 · Updated 5 years ago
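As a rough illustration of what a tabular Q-learning agent of this kind does, here is a minimal sketch in plain Python. It is not the repository's code. `TinyCliffWalk` is a hypothetical stand-in that imitates the layout and rewards of CliffWalking-v0 (4x12 grid, -1 per step, -100 and a return to the start for stepping into the cliff) so that the example runs without gym installed. The function names and the hyperparameters (`alpha`, `gamma`, `epsilon`, `episodes`, `max_steps`) are assumptions chosen for illustration.

```python
import random


class TinyCliffWalk:
    """Hypothetical 4x12 grid imitating CliffWalking-v0; not the gym implementation."""
    ROWS, COLS = 4, 12
    MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left
    n_states, n_actions = ROWS * COLS, 4

    def reset(self):
        self.pos = (self.ROWS - 1, 0)  # start in the bottom-left corner
        return self._index()

    def _index(self):
        return self.pos[0] * self.COLS + self.pos[1]

    def step(self, action):
        dr, dc = self.MOVES[action]
        r = min(max(self.pos[0] + dr, 0), self.ROWS - 1)  # clip to the grid
        c = min(max(self.pos[1] + dc, 0), self.COLS - 1)
        if r == self.ROWS - 1 and 0 < c < self.COLS - 1:
            # Stepped into the cliff: large penalty, back to the start, episode continues.
            self.pos = (self.ROWS - 1, 0)
            return self._index(), -100, False
        self.pos = (r, c)
        done = (r == self.ROWS - 1 and c == self.COLS - 1)  # bottom-right goal
        return self._index(), -1, done


def greedy_action(q, s):
    """Index of the highest-valued action in state s (first one wins ties)."""
    return max(range(len(q[s])), key=lambda i: q[s][i])


def train(env, episodes=2000, alpha=0.5, gamma=1.0, epsilon=0.1,
          max_steps=500, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(env.n_actions)  # explore
            else:
                a = greedy_action(q, s)           # exploit
            s2, reward, done = env.step(a)
            # Off-policy target: bootstrap from the best action in the next state.
            target = reward if done else reward + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if done:
                break
    return q


def greedy_rollout(env, q, max_steps=100):
    """Follow the greedy policy once; return (total reward, reached goal)."""
    s, total, done = env.reset(), 0, False
    for _ in range(max_steps):
        s, reward, done = env.step(greedy_action(q, s))
        total += reward
        if done:
            break
    return total, done


if __name__ == "__main__":
    env = TinyCliffWalk()
    q_table = train(env)
    print(greedy_rollout(env, q_table))
```

To run the same loop against a real gym environment, the stand-in class would be replaced by the environment object, with `reset` and `step` adapted to the return values of the installed gym version, since those differ between releases.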
Alternatives and similar repositories for Q-learning
Users that are interested in Q-learning are comparing it to the libraries listed below.
- 💻 Terminal-Agent with Human-in-the-Loop Learning ☆39 · Jan 16, 2026 · Updated 2 months ago
- Implementations of Influential Recommender System ☆11 · Oct 29, 2024 · Updated last year
- The code of paper LMC: Fast Training of GNNs via Subgraph Sampling with Provable Convergence. Zhihao Shi, Xize Liang, Jie Wang. ICLR 2023… ☆47 · Feb 15, 2023 · Updated 3 years ago
- Advanced Python programming ☆15 · Dec 18, 2019 · Updated 6 years ago
- Langchain Agent finetuning using 7B - LLAMA 2, on hotpotQA (Retroformer framework) ☆16 · Sep 5, 2023 · Updated 2 years ago
- The code of paper *Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization*. ☆18 · Mar 26, 2022 · Updated 3 years ago
- Analyse Social Network of co-authors in DBLP website (https://dblp.uni-trier.de) using NetworkX. ☆14 · May 27, 2020 · Updated 5 years ago
- An ASCII Header Generator for Network Protocols ☆14 · Dec 12, 2024 · Updated last year
- ☆17 · Nov 3, 2024 · Updated last year
- A solutions manual for Set Theory by Thomas Jech ☆13 · Aug 12, 2018 · Updated 7 years ago
- ☆11 · Jan 6, 2024 · Updated 2 years ago
- A novel template-free retrosynthesizer that can generate diverse sets of reactants for a desired product via discrete conditional variati… ☆15 · Aug 7, 2022 · Updated 3 years ago
- Executive control code for STRANDS robots. ☆11 · Feb 13, 2020 · Updated 6 years ago
- Must-read papers on Knowledge Graph Reasoning (KGR) ☆21 · Mar 16, 2020 · Updated 6 years ago
- Neural theorem proving evaluation via the Lean REPL ☆23 · Jul 12, 2025 · Updated 8 months ago
- Code and data for paper named: Large language models for automatic equation discovery of nonlinear dynamics ☆13 · Mar 6, 2025 · Updated last year
- ☆12 · Oct 5, 2021 · Updated 4 years ago
- Implementation of the paper Unsupervised Domain Adaptation by Backpropagation ☆10 · Dec 1, 2018 · Updated 7 years ago
- ☆18 · Jan 26, 2024 · Updated 2 years ago
- [ICML2025] Official codebase for "TeLoGraF: Temporal Logic Planning via Graph-encoded Flow Matching" ☆20 · Jul 14, 2025 · Updated 8 months ago
- Official code for "From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation" (ICLR2026) ☆31 · Mar 1, 2026 · Updated 3 weeks ago
- ☆16 · Feb 17, 2025 · Updated last year
- ☆21 · Feb 21, 2026 · Updated last month
- OpenAI gym environments for goal-conditioned and language-conditioned reinforcement learning ☆14 · Jan 27, 2026 · Updated last month
- ☆10 · Jun 7, 2021 · Updated 4 years ago
- ☆30 · Dec 27, 2024 · Updated last year
- ☆13 · Jul 7, 2024 · Updated last year
- [T-RO] Python implementation of PRobabilistically-Informed Motion Primitives (PRIMP) ☆12 · Apr 19, 2024 · Updated last year
- A basic Gomoku (five-in-a-row) game based on Monte Carlo tree search (MCTS). ☆14 · Apr 8, 2021 · Updated 4 years ago
- PhysReason Benchmark ☆19 · Jul 8, 2025 · Updated 8 months ago
- ☆11 · Jul 1, 2024 · Updated last year
- Must-read papers on Knowledge Graph Embedding ☆29 · Oct 15, 2020 · Updated 5 years ago
- [NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents ☆40 · May 2, 2024 · Updated last year
- Official Code Repository for the POLICEd-RL Paper: https://www.roboticsproceedings.org/rss20/p104.html ☆13 · Mar 4, 2025 · Updated last year
- Code for Policy Bifurcation in Safe Reinforcement Learning ☆10 · Jul 4, 2025 · Updated 8 months ago
- [NeurIPS 2024] Official code for "Variational Distillation of Diffusion Policies into Mixture of Experts" ☆17 · Dec 7, 2024 · Updated last year
- A molecule generative model that uses interaction fingerprints (docking poses) as constraints. ☆15 · Feb 13, 2022 · Updated 4 years ago
- ☆36 · Dec 26, 2022 · Updated 3 years ago
- Some open-source Android projects I have come across ☆20 · Sep 11, 2017 · Updated 8 years ago