HKUST-KnowComp / PrivaCI-BenchLinks

☆12

Alternatives and similar repositories for PrivaCI-Bench

Users that are interested in PrivaCI-Bench are comparing it to the libraries listed below

Sorting:

baixianghuang / HalluEditBench
Can Knowledge Editing Really Correct Hallucinations? (ICLR 2025)
☆24Updated 2 months ago
MiaoXiong2320 / llm-uncertainty
code repo for ICLR 2024 paper "Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs"
☆125Updated last year
Jometeorie / KnowledgeSpread
☆33Updated 9 months ago
ucl-dark / llm_debate
Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers"
☆113Updated last year
ryoungj / ToolEmu
[ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use
☆152Updated last year
ChnQ / TracingLLM
☆28Updated last year
SafeAILab / RAIN
[ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning
☆96Updated last year
sanjibanc / agent_prm
☆43Updated 5 months ago
ShuoTang123 / MATRIX
Implementation of the MATRIX framework (ICML 2024)
☆57Updated last year
princeton-nlp / corpus-poisoning
[EMNLP 2023] Poisoning Retrieval Corpora by Injecting Adversarial Passages https://arxiv.org/abs/2310.19156
☆35Updated last year
JacksonWuxs / UsableXAI_LLM
Using Explanations as a Tool for Advanced LLMs
☆66Updated 10 months ago
icip-cas / Verifier-Engineering
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
☆61Updated 8 months ago
flamewei123 / DEPN
☆24Updated last year
Lordog / R-Judge
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024)
☆82Updated 2 months ago
OSU-NLP-Group / AgentSafety
☆99Updated 3 months ago
tonychenxyz / selfie
This repository contains the code and data for the paper "SelfIE: Self-Interpretation of Large Language Model Embeddings" by Haozhe Chen,…
☆50Updated 7 months ago
PKU-Alignment / beavertails
BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
☆151Updated last year
Scarelette / CulturePark
☆24Updated 9 months ago
AlphaPav / mem-kk-logic
On Memorization of Large Language Models in Logical Reasoning
☆70Updated 4 months ago
hkust-nlp / Activation_Decoding
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)
☆61Updated last year
MingyuJ666 / The-Impact-of-Reasoning-Step-Length-on-Large-Language-Models
[ACL'24] Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models (LLMs). However, the correla…
☆46Updated 2 months ago
zitian-gao / SC-MCTS
Interpretable Contrastive Monte Carlo Tree Search Reasoning
☆48Updated 8 months ago
scaleapi / plansearch
e
☆39Updated 3 months ago
amazon-science / PrefEval
☆23Updated 2 months ago
Vaidehi99 / InfoDeletionAttacks
☆44Updated 5 months ago
Reason-Wang / NAT
[NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…
☆26Updated last year
zjunlp / KnowledgeCircuits
[NeurIPS 2024] Knowledge Circuits in Pretrained Transformers
☆151Updated 5 months ago
likenneth / dialogue_action_token
Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner
☆26Updated last year
Yu-Fangxu / FoR
[ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples
☆103Updated last week
SuperBruceJia / Awesome-LLM-Self-Consistency
Awesome LLM Self-Consistency: a curated list of Self-consistency in Large Language Models
☆105Updated 2 weeks ago