josh-ashkinaze / plurals
Plurals: A System for Guiding LLMs Via Simulated Social Ensembles
☆15 Updated 3 weeks ago
Alternatives and similar repositories for plurals:
Users who are interested in plurals are comparing it to the libraries listed below.
- The Prism Alignment Project ☆62 Updated 8 months ago
- ☆90 Updated 7 months ago
- Datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆66 Updated last year
- Factored Cognition Primer: How to write compositional language model programs ☆48 Updated last year
- ☆81 Updated 3 months ago
- ☆65 Updated 9 months ago
- ReactJS library for "Cells, Generators, and Lenses": object-oriented UI components to compose LLM-powered writing interfaces that support… ☆17 Updated last year
- A toolkit for describing model features and intervening on those features to steer behavior. ☆151 Updated 2 months ago
- ☆42 Updated 8 months ago
- A mechanistic approach for understanding and detecting factual errors of large language models. ☆39 Updated 6 months ago
- Edu-ConvoKit: An Open-Source Framework for Education Conversation Data ☆82 Updated 5 months ago
- ☆100 Updated 8 months ago
- LLM experiments done during SERI MATS - focusing on activation steering / interpreting activation spaces ☆85 Updated last year
- Governance of the Commons Simulation (GovSim) ☆31 Updated this week
- Improving Alignment and Robustness with Circuit Breakers ☆175 Updated 3 months ago
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks ☆40 Updated last month
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions" ☆64 Updated 7 months ago
- Concept Induction: Analyzing Unstructured Text with High-Level Concepts Using LLooM (CHI 2024 paper). LLooM automatically surfaces high-l… ☆71 Updated last month
- Evaluating the Moral Beliefs Encoded in LLMs ☆23 Updated last month
- PAIR.withgoogle.com and friends' work on interpretability methods ☆162 Updated last month
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ☆46 Updated 7 months ago
- Functional Benchmarks and the Reasoning Gap ☆82 Updated 3 months ago
- Repo for the paper "Detecting Logical Fallacies: From Quiz to Climate Change News" (2021) ☆70 Updated last year
- ☆125 Updated 2 months ago
- ☆31 Updated 3 months ago
- ☆29 Updated last year
- ☆25 Updated 9 months ago
- Data and code for the Corr2Cause paper (ICLR 2024) ☆91 Updated 9 months ago
- ☆37 Updated 2 months ago
- ☆41 Updated this week