wollschlager / geometry-of-refusal

Code for the paper: The Geometry of Refusal in Large Language Models: Concept Cones and Representational Independence
16 · Updated last month

Alternatives and similar repositories for geometry-of-refusal

Users who are interested in geometry-of-refusal are comparing it to the libraries listed below.
