Lslland/T-Vaccine

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Lslland/T-Vaccine)

Lslland / T-Vaccine

☆19

Alternatives and similar repositories for T-Vaccine

Users that are interested in T-Vaccine are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

git-disl / Vaccine
View on GitHub
This is the official code for the paper "Vaccine: Perturbation-aware Alignment for Large Language Models" (NeurIPS2024)
☆51Jan 15, 2026Updated 6 months ago
ethz-spylab / unlearning-vs-safety
View on GitHub
☆27Oct 6, 2024Updated last year
zhliu0106 / learning-to-refuse
View on GitHub
Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"
☆10Dec 13, 2024Updated last year
git-disl / Booster
View on GitHub
This is the official code for the paper "Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturba…
☆41Mar 22, 2025Updated last year
EnnengYang / Efficient-WEMoE
View on GitHub
Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging. Arxiv, 2024.
☆16Oct 28, 2024Updated last year
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
LLLeoLi / LARF
View on GitHub
[EMNLP 2025] Layer-Aware Representation Filtering: Purifying Finetuning Data to Preserve LLM Safety Alignment
☆15Jul 22, 2025Updated last year
HelloEveryboby / Butler
View on GitHub
Butler 是一个用于自动化服务管理和任务调度的工具项目。
☆17Updated this week
arobey1 / advbench
View on GitHub
☆45Mar 3, 2023Updated 3 years ago
git-disl / Lisa
View on GitHub
This is the official code for the paper "Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning" (NeurIPS2024)
☆29Sep 10, 2024Updated last year
boyiwei / CoTaEval
View on GitHub
[NeurIPS 2024 D&B] Evaluating Copyright Takedown Methods for Language Models
☆17Jul 17, 2024Updated 2 years ago
OPTML-Group / Unlearn-Simple
View on GitHub
[NeurIPS25] Official repo for "Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning"
☆45Oct 3, 2025Updated 9 months ago
jaechan-repo / muse_bench
View on GitHub
☆33Aug 9, 2024Updated last year
PKU-YuanGroup / AsFT
View on GitHub
Code for the paper "AsFT: Anchoring Safety During LLM Fune-Tuning Within Narrow Safety Basin".
☆37Jul 10, 2025Updated last year
Jayfeather1024 / Backdoor-Enhanced-Alignment
View on GitHub
☆24Dec 8, 2024Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
houseme / sensitive-rs
View on GitHub
Sensitive-rs is a Rust library for finding, validating, filtering, and replacing sensitive words. It provides efficient algorithms to han…
☆26Jul 22, 2026Updated last week
tanganke / subspace_fusion
View on GitHub
Code for paper "Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion"
☆14Mar 28, 2024Updated 2 years ago
princeton-polaris-lab / Evaluating-Durable-Safeguards
View on GitHub
[ICLR 2025] On Evluating the Durability of Safegurads for Open-Weight LLMs
☆13Jun 20, 2025Updated last year
licong-lin / negative-preference-optimization
View on GitHub
☆76Jul 15, 2024Updated 2 years ago
rishub-tamirisa / tamper-resistance
View on GitHub
[ICLR 2025] Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs"
☆68Jun 9, 2025Updated last year
llm-misinformation / llm-misinformation
View on GitHub
The dataset and code for the ICLR 2024 paper "Can LLM-Generated Misinformation Be Detected?"
☆85Jul 11, 2026Updated 2 weeks ago
aladinD / SafeMERGE
View on GitHub
Code for SafeMERGE (ICLR 2025).
☆15Apr 1, 2025Updated last year
sail-sg / closer-look-LLM-unlearning
View on GitHub
[ICLR 2025] A Closer Look at Machine Unlearning for Large Language Models
☆49Dec 4, 2024Updated last year
LLM-Tuning-Safety / LLMs-Finetuning-Safety
View on GitHub
We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20…
☆358Feb 23, 2024Updated 2 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
EleutherAI / deep-ignorance
View on GitHub
☆20Jan 7, 2026Updated 6 months ago
graldij / transformer-fusion
View on GitHub
Official repository of the "Transformer Fusion with Optimal Transport" paper, published as a conference paper at ICLR 2024.
☆31Apr 19, 2024Updated 2 years ago
phax / en16931-cii2ubl
View on GitHub
Converter for EN16931 invoices from CII to UBL
☆45Updated this week
SaFo-Lab / FIUBench
View on GitHub
A Task of Fictitious Unlearning for VLMs
☆27Apr 6, 2025Updated last year
IBM / SafeLoRA
View on GitHub
Github repo for NeurIPS 2024 paper "Safe LoRA: the Silver Lining of Reducing Safety Risks when Fine-tuning Large Language Models"
☆29Dec 21, 2025Updated 7 months ago
CryptoAILab / misalignment
View on GitHub
[NDSS'25] The official implementation of safety misalignment.
☆19Jan 8, 2025Updated last year
AI21Labs / factor
View on GitHub
Code and data for the FACTOR paper
☆54Nov 15, 2023Updated 2 years ago
GodXuxilie / PromptAttack
View on GitHub
An LLM can Fool Itself: A Prompt-Based Adversarial Attack (ICLR 2024)
☆117Jan 21, 2025Updated last year
git-disl / awesome_LLM-harmful-fine-tuning-papers
View on GitHub
A survey on harmful fine-tuning attack for large language model (ACM CSUR)
☆247Jun 22, 2026Updated last month
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
osehmathias / lisa
View on GitHub
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
☆38Apr 4, 2024Updated 2 years ago
DAMO-NLP-SG / multilingual-safety-for-LLMs
View on GitHub
[ICLR 2024]Data for "Multilingual Jailbreak Challenges in Large Language Models"
☆106Mar 7, 2024Updated 2 years ago
luyug / Dense
View on GitHub
A toolkit for building dense retrievers with deep language models.
☆63Sep 24, 2021Updated 4 years ago
TCMAI-BJTU / LingdanLLM
View on GitHub
TCM Lingdan LLM
☆52Jun 1, 2026Updated last month
QwenLM / online_merging_optimizers
View on GitHub
Implementations of online merging optimizers proposed by Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
☆82Jun 19, 2024Updated 2 years ago
lfy79001 / TableQAKit
View on GitHub
A Toolkit for Table-based Question Answering
☆117Oct 19, 2023Updated 2 years ago
qinyiwei / InfoBench
View on GitHub
☆61Aug 22, 2024Updated last year