myZeratul / Causal-Debias
☆10, updated 2 years ago
Alternatives and similar repositories for Causal-Debias
Users interested in Causal-Debias are comparing it to the repositories listed below.
- ☆27, updated last year
- This is the repo for the survey of Bias and Fairness in IR with LLMs. (☆53, updated 2 months ago)
- ☆56, updated 3 months ago
- Implementation of the Findings of NAACL 2022 paper "Exploring the Universal Vulnerability of Prompt-based Learning Paradigm" (☆29, updated 2 years ago)
- ☆20, updated 2 months ago
- ☆26, updated 6 months ago
- [ACL 2023] Preserving Commonsense Knowledge from Pre-trained Language Models via Causal Inference (☆23, updated last year)
- ☆12, updated 9 months ago
- Code for the paper "Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM" (☆12, updated last year)
- ☆24, updated last year
- Official code for the ACL 2023 paper "Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confid…" (☆23, updated 2 years ago)
- Official implementation of the EMNLP 2021 paper "ONION: A Simple and Effective Defense Against Textual Backdoor Attacks" (☆33, updated 3 years ago)
- The dataset and code for the ICLR 2024 paper "Can LLM-Generated Misinformation Be Detected?" (☆66, updated 6 months ago)
- [ACL 2024] Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models (☆49, updated 9 months ago)
- "In-Context Unlearning: Language Models as Few Shot Unlearners", Martin Pawelczyk, Seth Neel*, and Himabindu Lakkaraju*; ICML 2024 (☆26, updated last year)
- ☆28, updated 11 months ago
- Data for our paper "Defending ChatGPT against Jailbreak Attack via Self-Reminder" (☆18, updated last year)
- [ACL 2024 Main] Data and code for WaterBench: Towards Holistic Evaluation of LLM Watermarks (☆26, updated last year)
- Code for the paper "BadPrompt: Backdoor Attacks on Continuous Prompts" (☆36, updated 10 months ago)
- Official implementation of our paper "Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Opera…" (☆11, updated 8 months ago)
- Chain of Attack: a Semantic-Driven Contextual Multi-Turn attacker for LLM (☆31, updated 4 months ago)
- ☆18, updated 8 months ago
- FedJudge: Federated Legal Large Language Model (☆34, updated 8 months ago)
- [ACL 2024] What does the bot say? (☆22, updated 9 months ago)
- ☆41, updated 8 months ago
- [ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization (☆24, updated 10 months ago)
- Code for the paper "Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models" (NAACL-…) (☆41, updated 3 years ago)
- WWW 2024: New Frontiers of Knowledge Graph Reasoning: Recent Advances and Future Trends (☆17, updated last year)
- A Task of Fictitious Unlearning for VLMs (☆17, updated 2 months ago)
- Code for the Findings of EMNLP 2023 paper "Multi-step Jailbreaking Privacy Attacks on ChatGPT" (☆33, updated last year)