YFHuangxxxx / CBBQ

☆27

Alternatives and similar repositories for CBBQ

Users who are interested in CBBQ are comparing it to the repositories listed below.

thu-coai / DiaSafety
This repo is for the paper: On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark
☆25 Updated 3 years ago
NLP2CT / NLPCC-2025-Task1
NLPCC-2025 Shared-Task 1: LLM-Generated Text Detection
☆16 Updated 6 months ago
RUCAIBox / Language-Specific-Neurons
☆88 Updated 11 months ago
UM-FAH-Yuan / FIE2025
☆14 Updated 5 months ago
thunlp / LLM-generated-text-detection
☆14 Updated 2 years ago
THU-BPM / CHEF
The source code of paper "CHEF: A Pilot Chinese Dataset for Evidence-Based Fact-Checking"
☆81 Updated 2 years ago
OpenLMLab / Sniffer
☆27 Updated 2 years ago
pillowsofwind / Knowledge-Conflicts-Survey
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
☆148 Updated last year
RUCAIBox / HaluEval-2.0
☆47 Updated last year
PKU-Alignment / beavertails
BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
☆167 Updated 2 years ago
hongbinye / Cognitive-Mirage-Hallucinations-in-LLMs
Repository for the paper "Cognitive Mirage: A Review of Hallucinations in Large Language Models"
☆47 Updated 2 years ago
fanqiwan / Explore-Instruct
EMNLP'2023: Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration
☆36 Updated last year
CUHK-ARISE / LLMPersonality
Code and Results of the Paper: On the Reliability of Psychological Scales on Large Language Models
☆30 Updated last year
HillZhang1999 / ICD
Code & Data for our Paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations"
☆69 Updated last year
icip-cas / awesome-auto-alignment
Collection of papers for scalable automated alignment.
☆94 Updated last year
yinzhangyue / SelfAware
Do Large Language Models Know What They Don’t Know?
☆102 Updated last year
Abbey4799 / CELLO
Code and data for the paper "Can Large Language Models Understand Real-World Complex Instructions?" (AAAI 2024)
☆50 Updated last year
Hunter-DDM / knowledge-neurons
Code for the ACL-2022 paper "Knowledge Neurons in Pretrained Transformers"
☆173 Updated last year
OpenMOSS / HalluQA
Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"
☆136 Updated last year
GAIR-NLP / alignment-for-honesty
☆76 Updated last year
wangcunxiang / LLM-Factuality-Survey
The repository for the survey paper <<Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity>>
☆340 Updated last year
qinyiwei / InfoBench
☆57 Updated last year
ICTMCG / Awesome-Machine-Generated-Text
Continuously updated list of related resources for generative LLMs like GPT and their analysis and detection.
☆228 Updated 6 months ago
MikeGu721 / XiezhiBenchmark
☆98 Updated last year
zhu-minjun / PAlign
Personality Alignment of Language Models
☆51 Updated 4 months ago
shmsw25 / FActScore
A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic…
☆406 Updated 7 months ago
zzhang0179 / Unveiling-Linguistic-Regions-in-LLMs
[ACL 2024] Unveiling Linguistic Regions in Large Language Models
☆33 Updated last year
blcuicall / OMGEval
OMGEval😮: An Open Multilingual Generative Evaluation Benchmark for Foundation Models
☆35 Updated last year
hkust-nlp / felm
Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)
☆61 Updated last year
thu-coai / ComplexBench
Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track)
☆97 Updated 9 months ago