SALT-NLP / normbank
Data and code for the paper "NormBank: A Knowledge Bank of Situational Social Norms"
☆33 Updated 2 years ago
Alternatives and similar repositories for normbank
Users who are interested in normbank compare it to the repositories listed below.
- ☆34 Updated 2 years ago
- Code for the paper "CoS: Enhancing Personalization and Mitigating Bias with Context Steering" ☆18 Updated 10 months ago
- Alignment with a millennium of moral progress. Spotlight@NeurIPS 2024 Track on Datasets and Benchmarks. ☆24 Updated 6 months ago
- LLM Agora, debating between open-source LLMs to refine the answers ☆80 Updated 2 years ago
- Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task. ☆150 Updated last month
- Sotopia: an Open-ended Social Learning Environment (ICLR 2024 spotlight) ☆256 Updated 3 weeks ago
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆78 Updated last year
- ☆46 Updated 3 weeks ago
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆66 Updated 11 months ago
- ☆22 Updated last year
- For OpenMOSS Mechanistic Interpretability Team's Sparse Autoencoder (SAE) research. ☆157 Updated last week
- This repo contains code for our NeurIPS 2023 spotlight paper: Evaluating and Inducing Personality in Pre-trained Language Models ☆55 Updated last year
- ☆179 Updated last year
- ☆116 Updated last year
- ☆85 Updated 10 months ago
- Exploring the Limitations of Large Language Models on Multi-Hop Queries ☆27 Updated 7 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆181 Updated 6 months ago
- Repository for the Bias Benchmark for QA dataset. ☆129 Updated last year
- Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers". ☆80 Updated last year
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆77 Updated last year
- Inspecting and Editing Knowledge Representations in Language Models ☆117 Updated 2 years ago
- ☆57 Updated 2 years ago
- ☆98 Updated last year
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆121 Updated last year
- [EMNLP '23] Discriminator-Guided Chain-of-Thought Reasoning ☆49 Updated last year
- Steering Llama 2 with Contrastive Activation Addition ☆191 Updated last year
- Official Code Repository for LM-Steer Paper: "Word Embeddings Are Steers for Language Models" (ACL 2024 Outstanding Paper Award) ☆125 Updated 3 months ago
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging ☆110 Updated 2 years ago
- FeatureAlignment = Alignment + Mechanistic Interpretability ☆31 Updated 7 months ago
- Machine Theory of Mind Reading List. Built upon EMNLP Findings 2023 Paper: Towards A Holistic Landscape of Situated Theory of Mind in Lar… ☆144 Updated 8 months ago