HumanCompatibleAI / interpreting-rewards

Experiments in applying interpretability techniques to learned reward functions.
10 · Updated 4 years ago

Alternatives and similar repositories for interpreting-rewards

Users who are interested in interpreting-rewards compare it to the libraries listed below.
