zafarali/policy-gradient-methods

Modular PyTorch implementation of policy gradient methods

☆24

Alternatives and similar repositories for policy-gradient-methods

Users who are interested in policy-gradient-methods are comparing it to the libraries listed below.

google-research / policy-learning-landscape
Explore the optimization landscape for direct policy learning reinforcement learning.
☆51 · Jan 16, 2019 · Updated 7 years ago
Mizux / bazel-pybind11
Bazel C++ Pybind11 Sample
☆12 · Updated this week
UKPLab / acl2019-GPPL-humour-metaphor
☆14 · Sep 30, 2022 · Updated 3 years ago
JuliaPOMDP / TabularTDLearning.jl
Julia implementations of temporal difference Reinforcement Learning algorithms like Q-Learning and SARSA
☆12 · Nov 16, 2025 · Updated 8 months ago
LorenzoAusiello / Multi-Sources-Quantile-Regression-Neural-Network-in-QWIM
This project presents the application of a MS-QRNN model designed to estimate Value at Risk accurately by integrating both numerical fin…
☆12 · May 15, 2024 · Updated 2 years ago
tianbingsz / SVRG
Stochastic Variance Reduction Policy Gradient Estimation
☆11 · Nov 6, 2018 · Updated 7 years ago
jinnaiyuu / Optimal-Options-ICML-2019
Code for generating options for planning and reinforcement learning
☆12 · Feb 18, 2021 · Updated 5 years ago
shiwj16 / raa-drl
☆11 · Apr 20, 2021 · Updated 5 years ago
epfml / quasi-global-momentum
☆11 · Dec 23, 2022 · Updated 3 years ago
Nardien / NMG
Official Code Repository for the paper "Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation …
☆20 · Jun 19, 2023 · Updated 3 years ago
igsor / HDPy
Heuristic Dynamic Programming with Python
☆14 · Jul 28, 2014 · Updated 11 years ago
YU-NLPLab / DeepMet
☆18 · May 26, 2020 · Updated 6 years ago
mewmew / float
Binary floating-point formats in Go (IEEE 754 half and quadruple precision, x86 extended precision and PowerPC quadruple precision with d…
☆23 · Dec 12, 2021 · Updated 4 years ago
cyoon1729 / Policy-Gradient-Methods
Implementation of Algorithms from the Policy Gradient Family. Currently includes: A2C, A3C, DDPG, TD3, SAC
☆100 · Jul 23, 2019 · Updated 7 years ago
kuc2477 / pytorch-splitnet
PyTorch implementation of ICML 2017 paper, SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Paral…
☆17 · Oct 24, 2017 · Updated 8 years ago
facebookresearch / slbo
Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees
☆94 · Sep 13, 2019 · Updated 6 years ago
fbora / tic-tac-GO_ZERO
Implementation of Alpha Go Zero algorithm for the game of tic-tac-toe
☆16 · Nov 4, 2017 · Updated 8 years ago
mohakbhardwaj / SaIL
☆17 · May 16, 2018 · Updated 8 years ago
sidgairo18 / unsupervised-style-learning
This repository contains the source code, models and data files for the work titled: "Unsupervised Image Style Embeddings for Retrieval a…
☆13 · May 29, 2021 · Updated 5 years ago
hongyanz / Stackelberg-GAN
Codes for Stackelberg GAN
☆15 · Apr 23, 2019 · Updated 7 years ago
kyunghyuncho / backprop-kalman-filter
☆45 · Nov 3, 2019 · Updated 6 years ago
yashpatel5400 / neuropath
A neural branch predictor tested using CPU emulator, testing both supervised learning and reinforcement learning (for COS 583: Great Mome…
☆15 · May 17, 2017 · Updated 9 years ago
cxxgtxy / POP3D
Policy Optimization with Penalized Point Probability Distance: an Alternative to Proximal Policy Optimization
☆44 · Nov 8, 2018 · Updated 7 years ago
andyliu42 / Counterfactual_Regret_Minimization_Python
Counterfactual Regret Minimization (CFR) sample code in Python
☆14 · Apr 16, 2019 · Updated 7 years ago
sinagolara / VRP
Adaptive Heuristic Method Based on SA and LNS for Solving Vehicle Routing Problem
☆13 · Oct 9, 2017 · Updated 8 years ago
khainb / CSW
A novel variant of sliced Wasserstein based on a new slicing technique that utilizes the convolution operator.
☆12 · Jan 14, 2023 · Updated 3 years ago
shagunsodhani / torch-template
Boiler plate code for Torch based ML projects
☆10 · Jul 14, 2021 · Updated 5 years ago
ermongroup / best-arm-delayed
Code for "Best arm identification in multi-armed bandits with delayed feedback", AISTATS 2018.
☆20 · Apr 3, 2018 · Updated 8 years ago
RobRomijnders / bandit
Implementation of Counterfactual risk minimization
☆26 · Apr 13, 2017 · Updated 9 years ago
facebookresearch / reward-estimator-corl
Reward Estimation for Variance Reduction in Deep Reinforcement Learning
☆23 · Oct 26, 2018 · Updated 7 years ago
wangbx66 / differentially-private-q-learning
☆13 · May 16, 2019 · Updated 7 years ago
ikostrikov / pytorch-trpo
PyTorch implementation of Trust Region Policy Optimization
☆448 · Sep 13, 2018 · Updated 7 years ago
lbarazza / VPG-PyTorch
Minimalistic implementation of Vanilla Policy Gradient with PyTorch
☆18 · Jun 18, 2019 · Updated 7 years ago
OpenMined / writing
☆12 · Jan 10, 2023 · Updated 3 years ago
iesl / leopard
☆24 · Nov 27, 2020 · Updated 5 years ago
david1309 / Multi_Task_RL
Project exploring Multi Task Deep Reinforcement Learning neural network architectures and algorithms with Open AI Gym and TensorFlow
☆17 · Sep 5, 2018 · Updated 7 years ago
gabrielgarza / openai-gym-policy-gradient
Reinforcement Learning using Policy Gradient to solve OpenAI Gym games
☆112 · Dec 13, 2017 · Updated 8 years ago
JohnnyYeeee / math_prog_synth_env
☆13 · Jul 22, 2021 · Updated 5 years ago
yenchenlin / evf-public
Experience-embedded Visual Foresight, CoRL 2019
☆14 · Nov 13, 2019 · Updated 6 years ago