UCSB-NLP-Chang/SemanticSmooth

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/UCSB-NLP-Chang/SemanticSmooth)

UCSB-NLP-Chang / SemanticSmooth

Implementation of paper 'Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing'

☆24

Alternatives and similar repositories for SemanticSmooth

Users that are interested in SemanticSmooth are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

ledllm / ledllm
View on GitHub
☆24Jun 16, 2024Updated 2 years ago
UCSB-NLP-Chang / PromptBoosting
View on GitHub
☆17Sep 5, 2023Updated 2 years ago
UCSB-NLP-Chang / Fairness-Reprogramming
View on GitHub
☆16Oct 16, 2023Updated 2 years ago
Shahriar-0 / Neural-Networks-and-Deep-Learning-Course-Projects-S2024
View on GitHub
neural network and deep learning course projects to work and design on different problems such as classification, regression, optimisatio…
☆15Aug 29, 2024Updated last year
UCSB-NLP-Chang / ThinkPrune
View on GitHub
☆46Sep 27, 2025Updated 9 months ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
UCSB-NLP-Chang / SelfDenoise
View on GitHub
☆14May 7, 2024Updated 2 years ago
UCSB-NLP-Chang / ULD
View on GitHub
Implementation of paper 'Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference' [NeurIPS'24…
☆26Jun 14, 2024Updated 2 years ago
poloclub / llm-self-defense
View on GitHub
LLM Self Defense: By Self Examination, LLMs know they are being tricked
☆52May 21, 2024Updated 2 years ago
SafeAILab / RAIN
View on GitHub
[ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning
☆99May 23, 2024Updated 2 years ago
UCSB-NLP-Chang / DiffusionDisentanglement
View on GitHub
Official implementation of the paper "Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models
☆175Oct 8, 2023Updated 2 years ago
auspicious3000 / ProsodyLM
View on GitHub
ProsodyLM: Uncovering the Emerging Prosody Processing Capabilities in Speech Language Models
☆46Nov 18, 2025Updated 8 months ago
theshi-1128 / jailbreak-bench
View on GitHub
The most comprehensive and accurate LLM jailbreak attack benchmark by far
☆21Mar 22, 2025Updated last year
alphadl / SafeLLM_with_IntentionAnalysis
View on GitHub
Towards Safe LLM with our simple-yet-highly-effective Intention Analysis Prompting
☆21Mar 25, 2024Updated 2 years ago
UCSB-NLP-Chang / diffusion_resampling
View on GitHub
Implementation for "Correcting Diffusion Generation through Resampling" [CVPR 2024]
☆34Dec 12, 2023Updated 2 years ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
XHMY / AutoDefense
View on GitHub
AutoDefense: Multi-Agent LLM Defense against Jailbreak Attacks
☆68Jan 15, 2026Updated 6 months ago
zthang / Focus
View on GitHub
☆24Feb 3, 2024Updated 2 years ago
UCSB-NLP-Chang / TextGrad
View on GitHub
☆24Sep 20, 2023Updated 2 years ago
eurekayuan / RigorLLM
View on GitHub
Implementation for "RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content"
☆24Jul 28, 2024Updated last year
arobey1 / smooth-llm
View on GitHub
☆135Nov 13, 2023Updated 2 years ago
THUDM / grb
View on GitHub
Graph Robustness Benchmark: A scalable, unified, modular, and reproducible benchmark for evaluating the adversarial robustness of Graph M…
☆100Nov 6, 2023Updated 2 years ago
OODRobustBench / OODRobustBench
View on GitHub
OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift. ICML 2024 and ICLRW-DMLR 2024
☆23Jul 25, 2024Updated 2 years ago
THU-KEG / Skill-Neuron
View on GitHub
Source code for EMNLP2022 paper "Finding Skill Neurons in Pre-trained Transformers via Prompt Tuning".
☆18Mar 13, 2023Updated 3 years ago
umd-huang-lab / Dynamics-Aware-Robust-Training
View on GitHub
ICLR 2023 paper "Exploring and Exploiting Decision Boundary Dynamics for Adversarial Robustness" by Yuancheng Xu, Yanchao Sun, Micah Gold…
☆26May 2, 2023Updated 3 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
NVlabs / causal_comp
View on GitHub
☆23Oct 17, 2022Updated 3 years ago
rottaca / FallDetectionProject
View on GitHub
This is the code repository for a project at Ulm University. It's a fall detection system based on address-event-based cameras.
☆11Sep 29, 2017Updated 8 years ago
sammcj / zsh-bootstrap
View on GitHub
bootstrap my zsh shell
☆16Mar 28, 2026Updated 3 months ago
Lyz1213 / BadEdit
View on GitHub
☆38Oct 17, 2024Updated last year
ejones313 / auditing-llms
View on GitHub
☆61Mar 9, 2023Updated 3 years ago
jrazi / repo2file4gpt
View on GitHub
This project aims to convert the content of GitHub repositories into a structured, machine-readable format, enabling AI models like ChatG…
☆12May 13, 2024Updated 2 years ago
serval-uni-lu / tabularbench
View on GitHub
TabularBench: Adversarial robustness benchmark for tabular data
☆22Apr 25, 2026Updated 3 months ago
shiningrain / JailGuard
View on GitHub
☆32Mar 16, 2025Updated last year
moshafieeha / UT-ECE-Student-Resources
View on GitHub
A curated list of valuable resources from our studies at the University of Tehran (UT), School of Electrical and Computer Engineering (EC…
☆72Feb 15, 2026Updated 5 months ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
Lucas-TY / llm_Implicit_reference
View on GitHub
Official Implementation of implicit reference attack
☆11Oct 16, 2024Updated last year
littleSunlxy / FedPU-torch
View on GitHub
Official PyTorch Implementation of Federated Learning with Positive and Unlabeled Data
☆10Aug 12, 2022Updated 3 years ago
hjian42 / CommunityLM
View on GitHub
[COLING 2022]: CommunityLM: Probing Partisan Worldviews from Language Models
☆14Jan 31, 2023Updated 3 years ago
Huangxy-Minel / System-Design-for-Federated-Learning
View on GitHub
Paper list of federated learning: About system design
☆13Apr 13, 2022Updated 4 years ago
kyegomez / SimplifiedTransformers
View on GitHub
SimplifiedTransformer simplifies transformer block without affecting training. Skip connections, projection parameters, sequential sub-bl…
☆15Updated this week
zeyuanyin / tiny-imagenet
View on GitHub
☆20Feb 8, 2024Updated 2 years ago
antgroup / Agent3Sigma-Canary
View on GitHub
Agent3σ-Canary is an evaluation framework for AI Agent security in realistic runtime environments.
☆33Jun 24, 2026Updated last month