MarkGHX / BiScope

Official Implementation of NeurIPS 2024 paper - BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens

☆27

Alternatives and similar repositories for BiScope

Users who are interested in BiScope are comparing it to the repositories listed below.

SolidShen / RIPPLE_official
☆20 Updated last year
Gwinhen / DRUPE
Distribution Preserving Backdoor Attack in Self-supervised Learning
☆20 Updated last year
Megum1 / ODSCAN
[IEEE S&P'24] ODSCAN: Backdoor Scanning for Object Detection Models
☆20 Updated 2 months ago
PurduePAML / DBS
☆18 Updated 3 years ago
Gwinhen / BackdoorVault
A toolbox for backdoor attacks.
☆22 Updated 2 years ago
Lyz1213 / BadEdit
☆36 Updated last year
lancopku / codable-watermarking-for-llm
Repository for Towards Codable Watermarking for Large Language Models
☆38 Updated 2 years ago
KaiyuanZh / OrthogLinearBackdoor
[Oakland 2024] Exploring the Orthogonality and Linearity of Backdoor Attacks
☆26 Updated 7 months ago
Megum1 / BEAGLE
[NDSS'23] BEAGLE: Forensics of Deep Learning Backdoor Attack for Better Defense
☆17 Updated last year
Lyz1213 / Backdoored_PPLM
☆15 Updated last year
inspire-group / RobustRAG
☆21 Updated last year
yyl-github-1896 / CodeTAE
This is the official code repository for paper "Exploiting the Adversarial Example Vulnerability of Transfer Learning of Source Code".
☆16 Updated 2 months ago
CryptoAILab / misalignment
[NDSS'25] The official implementation of safety misalignment.
☆17 Updated 11 months ago
qingjiesjtu / USC
This is the code repository of our submission: Understanding the Dark Side of LLMs’ Intrinsic Self-Correction.
☆63 Updated 11 months ago
ltroin / llm_attack_defense_arena
☆82 Updated 3 months ago
Megum1 / LOTUS
[CVPR'24] LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning
☆15 Updated 10 months ago
bangawayoo / mb-lm-watermarking
multi-bit language model watermarking (NAACL 24)
☆17 Updated last year
Megum1 / DFST
[AAAI'21] Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification
☆29 Updated 11 months ago
Gwinhen / MOTH
This is the implementation for IEEE S&P 2022 paper "Model Orthogonalization: Class Distance Hardening in Neural Networks for Better Secur…
☆11 Updated 3 years ago
WUSTL-CSPL / LLMJailbreak
☆37 Updated last year
cnut1648 / Model-Fingerprint
Fingerprint large language models
☆46 Updated last year
LIONS-EPFL / Charmer
Revisiting Character-level Adversarial Attacks for Language Models, ICML 2024
☆19 Updated 9 months ago
thu-coai / JailbreakDefense_GoalPriority
[ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
☆29 Updated last year
casperllm / CASPER
☆15 Updated last year
sleeepeer / PoisonedRAG
[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models
☆218 Updated 3 weeks ago
LLMSecurity / MasterKey
MASTERKEY is a framework designed to explore and exploit vulnerabilities in large language model chatbots by automating jailbreak attacks…
☆29 Updated last year
SolidShen / BAIT
🔥🔥🔥 Detecting hidden backdoors in Large Language Models with only black-box access
☆50 Updated 6 months ago
Raytsang123 / CLIBE
[NDSS 2025] "CLIBE: Detecting Dynamic Backdoors in Transformer-based NLP Models"
☆23 Updated 3 months ago
ZhangZhuoSJTU / LINT
☆17 Updated last year
zhangrui4041 / Instruction_Backdoor_Attack
☆26 Updated last year