git-disl / Virus
This is the official code for the paper "Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation"
☆53 Updated 11 months ago
Alternatives and similar repositories for Virus
Users who are interested in Virus are comparing it to the repositories listed below.
- Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025) ☆26 Updated 10 months ago
- Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning ☆52 Updated 2 months ago
- ☆86 Updated last year
- [TMLR'24] This repository includes the official implementation of our paper "FedConv: Enhancing Convolutional Neural Networks for Handling D… ☆25 Updated last year
- Unlearning Isn't Invisible: Detecting Unlearning Traces in LLMs from Model Outputs ☆24 Updated 6 months ago
- The official implementation of Preference Data Reward-Augmentation. ☆18 Updated 8 months ago
- Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps" ☆143 Updated 2 months ago
- [COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? ☆82 Updated 11 months ago
- The Granite Guardian models are designed to detect risks in prompts and responses. ☆123 Updated 2 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆85 Updated 9 months ago
- [ACL 2025] How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training ☆45 Updated 5 months ago
- The official implementation of "Learning Compact Vision Tokens for Efficient Large Multimodal Models" ☆29 Updated 6 months ago
- This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM ☆61 Updated last year
- Code associated with the EMNLP 2024 Main paper: "Image, tell me your story!" Predicting the original meta-context of visual misinformatio… ☆45 Updated last month
- Official Code for paper "Towards Efficient and Effective Unlearning of Large Language Models for Recommendation" (Frontiers of Computer S… ☆38 Updated last year
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya ☆124 Updated 5 months ago
- ☆144 Updated 8 months ago
- THOUGHTSCULPT, a general reasoning and search method for complex tasks ☆13 Updated last year
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆123 Updated last year
- [ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast ☆117 Updated last year
- Leveraging Base Language Models for Few-Shot Synthetic Data Generation ☆40 Updated 2 months ago
- A novel approach to improve the safety of large language models, enabling them to transition effectively from unsafe to safe state. ☆73 Updated 7 months ago
- The first dense retrieval model that can be prompted like an LM ☆89 Updated 7 months ago
- Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork" ☆33 Updated 2 years ago
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps" ☆25 Updated last year
- The repository for paper "Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs" ☆14 Updated last year
- ☆38 Updated last year
- Functional Benchmarks and the Reasoning Gap ☆90 Updated last year
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives ☆70 Updated last year
- Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings. ☆92 Updated last year