peterljq / Parsimonious-Concept-Engineering

Parsimonious Concept Engineering (PaCE) uses sparse coding over a large-scale concept dictionary to improve the trustworthiness of large language models by precisely controlling and modifying their neural activations.
25 · Updated 3 months ago

Related projects: