CSKrishna / Optimal-bidding-policy-using-Policy-Gradient-in-a-Multi-agent-Contextual-Bandit-setting

We use policy gradient to help agents learn optimal policies in a competitive multi-agent contextual bandit setting
11Updated 6 years ago

Alternatives and similar repositories for Optimal-bidding-policy-using-Policy-Gradient-in-a-Multi-agent-Contextual-Bandit-setting:

Users that are interested in Optimal-bidding-policy-using-Policy-Gradient-in-a-Multi-agent-Contextual-Bandit-setting are comparing it to the libraries listed below