tatsu-lab / opinions_qa
☆104 · Updated 9 months ago
Alternatives and similar repositories for opinions_qa:
Users who are interested in opinions_qa are comparing it to the repositories listed below.
- Repository for the Bias Benchmark for QA dataset. ☆100 · Updated last year
- The Prism Alignment Project ☆66 · Updated 9 months ago
- ACL 2022: An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models. ☆131 · Updated 2 months ago
- Inspecting and Editing Knowledge Representations in Language Models ☆112 · Updated last year
- ☆47 · Updated last year
- ☆124 · Updated last year
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆84 · Updated last week
- Datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆71 · Updated last year
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model ☆66 · Updated 2 years ago
- ☆34 · Updated 2 months ago
- ☆25 · Updated 4 months ago
- This repository includes code for the paper "Does Localization Inform Editing? Surprising Differences in Where Knowledge Is Stored vs. Ca… ☆59 · Updated last year
- GitHub repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023) ☆57 · Updated last year
- Code for "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Mod… ☆33 · Updated 11 months ago
- ☆40 · Updated 9 months ago
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆62 · Updated 3 months ago
- The accompanying code for "Transformer Feed-Forward Layers Are Key-Value Memories". Mor Geva, Roei Schuster, Jonathan Berant, and Omer Le… ☆89 · Updated 3 years ago
- [NAACL'25] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆47 · Updated 2 months ago
- ☆21 · Updated last year
- Dataset associated with "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation" paper ☆70 · Updated 3 years ago
- This repository contains data, code and models for contextual noncompliance. ☆20 · Updated 7 months ago
- ☆174 · Updated 2 years ago
- ☆22 · Updated 11 months ago
- Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se… ☆58 · Updated 2 years ago
- ☆44 · Updated 5 months ago
- ☆33 · Updated 4 months ago
- A curated list of research papers and resources on Cultural LLM. ☆36 · Updated 4 months ago
- Code and data accompanying the paper "TRUE: Re-evaluating Factual Consistency Evaluation". ☆75 · Updated 3 months ago
- ☆71 · Updated last year
- Token-level Reference-free Hallucination Detection ☆94 · Updated last year