cicl-stanford / moca
Language model evaluation for morality and causality
☆19 Updated last year
Alternatives and similar repositories for moca
Users who are interested in moca are comparing it to the repositories listed below.
- 👻 Code and benchmark for our EMNLP 2023 paper - "FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions" ☆55 Updated last year
- [ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets ☆218 Updated last year
- Repository for research in the field of Responsible NLP at Meta. ☆202 Updated 4 months ago
- Code repository for the paper "Mission: Impossible Language Models." ☆54 Updated 4 months ago
- Apps built using Inspired Cognition's Critique. ☆58 Updated 2 years ago
- [ACL 2022] CLUES: A Benchmark for Learning Classifiers using Natural Language Explanations ☆10 Updated 3 years ago
- Inspecting and Editing Knowledge Representations in Language Models ☆116 Updated 2 years ago
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ☆116 Updated 2 months ago
- ☆96 Updated last year
- ☆74 Updated last year
- We develop benchmarks and analysis tools to evaluate the causal reasoning abilities of LLMs. ☆125 Updated last year
- The Prism Alignment Project ☆79 Updated last year
- A Toolkit for Distributional Control of Generative Models ☆73 Updated last month
- ☆51 Updated last year
- CausalGym: Benchmarking causal interpretability methods on linguistic tasks ☆47 Updated 9 months ago
- DialOp: Decision-oriented dialogue environments for collaborative language agents ☆110 Updated 10 months ago
- ☆23 Updated last year
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆92 Updated last year
- A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models. ☆105 Updated last year
- Aligning AI With Shared Human Values (ICLR 2021) ☆298 Updated 2 years ago
- ☆36 Updated 2 years ago
- PAIR.withgoogle.com and friend's work on interpretability methods ☆202 Updated last week
- A curated list of research papers and resources on Cultural LLM. ☆48 Updated 11 months ago
- Resources for cultural NLP research ☆103 Updated 4 months ago
- ☆54 Updated 2 years ago
- Code and data accompanying the paper "TRUE: Re-evaluating Factual Consistency Evaluation". ☆81 Updated 2 months ago
- ☆41 Updated last year
- A corpus and code for understanding norms and subjectivity. 🤖 ☆52 Updated 11 months ago
- ☆210 Updated 2 years ago
- This project studies the performance and robustness of language models and task-adaptation methods. ☆152 Updated last year