PKU-Alignment/llms-resist-alignment

[ACL2025 Best Paper] Language Models Resist Alignment

☆43

Alternatives and similar repositories for llms-resist-alignment

Users who are interested in llms-resist-alignment are comparing it with the repositories listed below.

XiongPengNUS / PandaShifu
☆15 · Updated this week
wicai24 / DOOR-Alignment
☆16 · Apr 7, 2025 · Updated 10 months ago
epri-dev / PV-BESS-Hybrid
☆19 · Sep 22, 2025 · Updated 5 months ago
VimalWill / Vstream
Vstream - Video Analytics pipeline with Hardware based accelerations (dev - stage)
☆10 · Feb 2, 2024 · Updated 2 years ago
thu-coai / JailbreakDefense_GoalPriority
[ACL 2024] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
☆29 · Jul 9, 2024 · Updated last year
katiekang1998 / reasoning_generalization
☆33 · Jan 7, 2025 · Updated last year
GenTelLab / trustclaw
☆43 · Feb 9, 2026 · Updated 3 weeks ago
sandialabs / quest_planning
QuESt Planning is a long-term power system capacity expansion planning model that identifies cost-optimal energy storage, generation, and…
☆14 · Feb 4, 2026 · Updated 3 weeks ago
asu-iris / course_robotics
☆19 · Nov 20, 2025 · Updated 3 months ago
PRIS-CV / MSSRM
An implementation of MSSRM method
☆11 · Mar 23, 2023 · Updated 2 years ago
HiTAndRunaway / YoLoV5-helmet-recogition
Entry for the 2020 First Hunan Province Artificial Intelligence Competition
☆11 · Feb 17, 2022 · Updated 4 years ago
JuliaEnergy / PowerDynamicsExamples
Example Systems using PowerDynamics.jl
☆12 · Oct 10, 2022 · Updated 3 years ago
ROBUST-NL / paused_ev_charging
Source code for the paper titled: "Unlocking the full potential of smart charging: Addressing paused and delayed charging problems in ele…
☆11 · May 22, 2024 · Updated last year
klamike / lpviz
Visualize linear programming at https://lpviz.net
☆33 · Jan 20, 2026 · Updated last month
zhangluoyang / Yolo
YOLO object detection algorithm
☆15 · Jul 27, 2025 · Updated 7 months ago
kimvc7 / HDL
☆12 · Mar 15, 2023 · Updated 2 years ago
HydroXai / Enhancing-Safety-in-Large-Language-Models
Precision Knowledge Editing (PKE): A novel method to reduce toxicity in LLMs while preserving performance, with robust evaluations and ha…
☆11 · Nov 26, 2024 · Updated last year
ineveLoppiliF / Online-Isolation-Forest
☆16 · Jan 16, 2025 · Updated last year
a875560134 / yoloY
☆14 · May 1, 2023 · Updated 2 years ago
caltech-netlab / digital-twin-dataset
☆14 · Jan 8, 2026 · Updated last month
lgy0404 / MemGUI-Bench
Official code repo for the paper "MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic Environments"
☆25 · Updated this week
cqu20160901 / DETR_onnx_tensorRT_V2
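DETR on TensorRT: removes the unused auxiliary heads from inference, speeds up deployment further with fp16, and introduces a new method for fixing all-zero outputs after conversion to TensorRT.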
☆12 · Jan 9, 2024 · Updated 2 years ago
mxzheng / TrojViT
[CVPR 2023] "TrojViT: Trojan Insertion in Vision Transformers" by Mengxin Zheng, Qian Lou, Lei Jiang
☆14 · Jan 5, 2024 · Updated 2 years ago
PRIS-CV / Category-Specific-Prompt
Code release for "Category-Specific Prompts for Animal Action Recognition with Pretrained Vision-Language Models"
☆14 · Feb 21, 2024 · Updated 2 years ago
surajkothawade / talisman
[ECCV 2022] "TALISMAN: Targeted Active Learning for Object Detection with Rare Classes and Slices using Submodular Mutual Information" by…
☆10 · Sep 21, 2022 · Updated 3 years ago
sshin23 / opf-on-gpu
☆10 · Mar 25, 2024 · Updated last year
SongW-SW / CEB
☆13 · Jun 25, 2025 · Updated 8 months ago
MidiyaZhu / MePO
Code for Rethinking Prompt Optimizers: From Prompt Merits to Optimization
☆12 · Jan 12, 2026 · Updated last month
shn66 / LAMPOS
LAMPOS, a strategy-based solution approach for mp-MILPs for real-time mixed-integer MPC with sub-optimality quantification
☆11 · Jun 25, 2023 · Updated 2 years ago
zjunlp / LookAheadTuning
[WSDM 2026] LookAhead Tuning: Safer Language Models via Partial Answer Previews
☆17 · Dec 14, 2025 · Updated 2 months ago
CUHK-Shenzhen-SE / RetromorphicTesting
☆11 · Jan 19, 2025 · Updated last year
Adlik / zen_nas
Zen-NAS, a lightning fast, training-free Neural Architecture Searching algorithm
☆11 · Nov 12, 2021 · Updated 4 years ago
cha15yq / MRC-Crowd
Implementation of "Semi-Supervised Crowd Counting with Contextual Modeling: Facilitating Holistic Understanding of Crowd Scenes"
☆12 · Oct 2, 2024 · Updated last year
tli725 / JL-Corpus
For further understanding the wide array of emotions embedded in human speech, we are introducing an emotional speech corpus. In contrast…
☆11 · Oct 29, 2018 · Updated 7 years ago
ZrW00 / GraCeFul
The code implementation of GraCeFul (Accepted in COLING 2025)
☆13 · Jan 27, 2025 · Updated last year
Update-For-Integrated-Business-AI / CORU
☆16 · Jul 7, 2025 · Updated 7 months ago
shiqichen17 / SPA
Github repository for "Internalizing World Models via Self-Play Finetuning for Agentic RL"
☆33 · Nov 1, 2025 · Updated 4 months ago
tanganke / subspace_fusion
Code for paper "Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion"
☆14 · Mar 28, 2024 · Updated last year
utiasASRL / sdprlayer
☆13 · Nov 5, 2025 · Updated 3 months ago