yxwan123 / BiasAsker
☆15 · Updated 8 months ago
Related projects
Alternatives and complementary repositories for BiasAsker
- Official repository for ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models" ☆71 · Updated 2 months ago
- [ICLR 2024] Paper showing properties of safety tuning and exaggerated safety. ☆71 · Updated 6 months ago
- [ICLR 2024] Data for "Multilingual Jailbreak Challenges in Large Language Models" ☆62 · Updated 8 months ago
- Codes and datasets of the paper "Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment" ☆79 · Updated 8 months ago
- Code for the paper "Defending against LLM Jailbreaking via Backtranslation" ☆24 · Updated 3 months ago
- A novel approach to improve the safety of large language models, enabling them to transition effectively from an unsafe to a safe state. ☆52 · Updated last month
- Multilingual safety benchmark for Large Language Models ☆24 · Updated 2 months ago
- The repository for the paper "DebugBench: Evaluating Debugging Capability of Large Language Models". ☆57 · Updated 4 months ago
- [LREC-COLING'24] HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization ☆28 · Updated 2 months ago
- Official implementation of the EMNLP 2021 paper "ONION: A Simple and Effective Defense Against Textual Backdoor Attacks" ☆29 · Updated 3 years ago
- A lightweight library for large language model (LLM) jailbreaking defense. ☆39 · Updated last month
- Official code for the paper "Evaluating Copyright Takedown Methods for Language Models" ☆15 · Updated 4 months ago
- Releasing code for "ReCode: Robustness Evaluation of Code Generation Models" ☆48 · Updated 8 months ago
- Shadow Alignment: The Ease of Subverting Safely-Aligned Language Models ☆23 · Updated last year
- Official repository for ACL 2024 paper "SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding" ☆99 · Updated 4 months ago
- Mostly recording papers about models' trustworthy applications. Intending to include topics like model evaluation & analysis, security, c… ☆20 · Updated last year
- Code for the AAAI 2023 paper "CodeAttack: Code-based Adversarial Attacks for Pre-Trained Programming Language Models" ☆25 · Updated last year
- Code and data of the EMNLP 2022 paper "Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversaria…" ☆34 · Updated last year
- An Evolving Code Generation Benchmark Aligned with Real-world Code Repositories ☆46 · Updated 3 months ago
- AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLMs ☆45 · Updated 2 weeks ago
- Towards Safe LLM with our simple-yet-highly-effective Intention Analysis Prompting ☆13 · Updated 7 months ago
- Recent papers on (1) Psychology of LLMs; (2) Biases in LLMs. ☆43 · Updated last year
- Semi-Parametric Editing with a Retrieval-Augmented Counterfactual Model ☆65 · Updated 2 years ago
- [EMNLP 2024] The official GitHub repo for the paper "Course-Correction: Safety Alignment Using Synthetic Preferences" ☆19 · Updated last month
- Official Code for Paper: Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆60 · Updated last month