valentinhofmann / dialect-prejudice
☆32Updated last month
Related projects
Alternatives and complementary repositories for dialect-prejudice
- ☆94Updated 6 months ago
- ☆20Updated last year
- Datasets from the paper "Towards Understanding Sycophancy in Language Models"☆62Updated last year
- ☆28Updated last month
- The Prism Alignment Project☆37Updated 6 months ago
- Repository for research in the field of Responsible NLP at Meta.☆186Updated last week
- The official repo for SocKET: Social Knowledge Evaluation Tests☆19Updated last year
- ☆199Updated this week
- Code repository for the paper "Mission: Impossible Language Models."☆39Updated 10 months ago
- A collection of works that investigate social agents, simulations and their real-world impact in text, embodied, and robotics contexts.☆63Updated 5 months ago
- Code accompanying "How I learned to start worrying about prompt formatting".☆95Updated last month
- ☆75Updated last month
- Official implementation of the ACL 2024 paper "Scientific Inspiration Machines Optimized for Novelty"☆68Updated 7 months ago
- Dataset associated with "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation" paper☆67Updated 3 years ago
- This repository hosts the paper “LLM Based Math Tutoring: Challenges and Dataset”, along with the accompanying dataset. It explores the p…☆28Updated 2 months ago
- PAIR.withgoogle.com and friends' work on interpretability methods☆150Updated 3 weeks ago
- Run safety benchmarks against AI models and view detailed reports showing how well they performed.☆62Updated this week
- ☆111Updated last year
- Repository for the Bias Benchmark for QA dataset.☆87Updated 10 months ago
- Röttger et al. (2023): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models"☆63Updated 10 months ago
- ☆21Updated 8 months ago
- ☆44Updated last month
- Steering Llama 2 with Contrastive Activation Addition☆98Updated 5 months ago
- A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents☆19Updated last year
- OLAPH: Improving Factuality in Biomedical Long-form Question Answering☆38Updated 2 months ago
- ☆43Updated last month
- ☆63Updated 7 months ago
- Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs☆39Updated 3 weeks ago
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"☆62Updated 5 months ago
- Package to extract connotation frames☆80Updated 11 months ago