Twilight92z / Quantize-Watermark
☆20 · Updated last year
Alternatives and similar repositories for Quantize-Watermark
Users interested in Quantize-Watermark are comparing it to the repositories listed below.
- Codebase for decoding compressed trust. ☆23 · Updated last year
- [ICLR 2024] RAIN: Your Language Models Can Align Themselves without Finetuning ☆93 · Updated 11 months ago
- [NeurIPS 2024] Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses ☆61 · Updated 4 months ago
- Our research proposes a novel MoGU framework that improves LLMs' safety while preserving their usability. ☆15 · Updated 4 months ago
- ☆18 · Updated 5 months ago
- ☆32 · Updated last month
- [ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models" ☆58 · Updated 7 months ago
- The official repository for the paper "MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance" ☆36 · Updated last year
- [arXiv 2024] Denial-of-Service Poisoning Attacks on Large Language Models ☆18 · Updated 6 months ago
- A block pruning framework for LLMs. ☆22 · Updated 10 months ago
- ☆20 · Updated 5 months ago
- [ICML 2023] "Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?" by Ruisi Cai, Zhenyu Zhang, Zhangyang Wang ☆16 · Updated 2 years ago
- [EMNLP 2024] Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue ☆35 · Updated 6 months ago
- Model merging is a highly efficient approach for long-to-short reasoning. ☆46 · Updated last month
- [ICLR 2024] Official PyTorch implementation of "Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLM…" ☆47 · Updated last year
- [NeurIPS 2024] Code for "Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models" ☆46 · Updated 4 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆72 · Updated 7 months ago
- Official code for the paper "Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturba…" ☆26 · Updated last month
- [ICML 2024] Representation Surgery for Multi-Task Model Merging ☆45 · Updated 7 months ago
- ☆21 · Updated 4 months ago
- Code for our paper "Defending ChatGPT against Jailbreak Attack via Self-Reminder" in NMI. ☆48 · Updated last year
- The official repository of "Unnatural Language Are Not Bugs but Features for LLMs" ☆17 · Updated 2 months ago
- ☆42 · Updated last year
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 · Updated 3 weeks ago
- Official code for "SEAL: Steerable Reasoning Calibration of Large Language Models for Free" ☆22 · Updated last month
- [ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆76 · Updated last month
- [NeurIPS 2024] Official code for the paper "Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning" ☆20 · Updated 8 months ago
- 🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training ☆83 · Updated 5 months ago
- ☆14 · Updated 7 months ago
- [NeurIPS 2024] Official code for the paper "Vaccine: Perturbation-aware Alignment for Large Language Models" ☆42 · Updated 5 months ago