Yadhavaramanan / Policy-GradientLinks

Policy gradient methods are a type of reinforcement learning techniques that rely upon optimizing parametrized policies with respect to the expected return (long-term cumulative reward) by gradient descent. In the off-policy algorithm, actions are sample using behaviour policy and separate target policy is used to optimise for.
14Updated last year

Alternatives and similar repositories for Policy-Gradient

Users that are interested in Policy-Gradient are comparing it to the libraries listed below

Sorting: