Code for LLM_Catastrophic_Forgetting via SAM.
☆11Jun 7, 2024Updated last year
Alternatives and similar repositories for LLM_CatastrophicForgetting
Users that are interested in LLM_CatastrophicForgetting are comparing it to the libraries listed below
- The code for the paper "Dual Mutual Information Constraints for Discriminative Clustering"☆23Aug 22, 2024Updated last year
- Code for Retrieval-Augmented Perception (ICML 2025)☆69Updated this week
- Our research proposes a novel MoGU framework that improves LLMs' safety while preserving their usability.☆18Jan 14, 2025Updated last year
- PyTorch implementation of Deep Bilateral Learning for Real Time Image Enhancement.☆15Dec 9, 2018Updated 7 years ago
- Implementation of Weakly Supervised Deep Detection Networks with PyTorch☆12Dec 7, 2022Updated 3 years ago
- PyTorch Implementation for SeCu☆18Nov 24, 2023Updated 2 years ago
- Code for safety test in "Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates"☆22Sep 21, 2025Updated 6 months ago
- [TMLR 2024] Revisiting Random Weight Perturbation for Efficiently Improving Generalization☆12Oct 18, 2024Updated last year
- [2025-TMLR] A Survey on the Honesty of Large Language Models☆64Dec 8, 2024Updated last year
- [CVPR 2024 Highlight] - Stationary Representations: Optimally Approximating Compatibility and Implications for Improved Model Replacement…☆13Oct 21, 2024Updated last year
- Repo for assignments and term projects☆11Jul 21, 2019Updated 6 years ago
- 🚀enhanced GRPO with more verifiable rewards and real-time evaluators☆37Jan 27, 2026Updated last month
- Awesome Large Reasoning Model (LRM) Safety. This repository is used to collect security-related research on large reasoning models such as …☆82Updated this week
- [ICLR 2025] Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs"☆67Jun 9, 2025Updated 9 months ago
- ☆14Feb 26, 2025Updated last year
- [NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search☆17Jan 24, 2026Updated last month
- Code for the paper "Mehta, S. V., Patil, D., Chandar, S., & Strubell, E. (2023). An Empirical Investigation of the Role of Pre-training i…☆17Mar 18, 2024Updated 2 years ago
- Learnable drift compensation (LDC) reduces semantic drift in continual learning using a trainable projector to map between tasks.☆19Nov 13, 2024Updated last year
- Identification of the Adversary from a Single Adversarial Example (ICML 2023)☆10Jul 15, 2024Updated last year
- [ICLR 2025] On Evaluating the Durability of Safeguards for Open-Weight LLMs☆13Jun 20, 2025Updated 9 months ago
- [WSDM 2026] LookAhead Tuning: Safer Language Models via Partial Answer Previews☆17Dec 14, 2025Updated 3 months ago
- ☆19May 14, 2025Updated 10 months ago
- ☆17Dec 21, 2023Updated 2 years ago
- Code for paper "Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion"☆14Mar 28, 2024Updated last year
- Some resources about Ray Forward Meetup☆30Dec 25, 2025Updated 2 months ago
- 👋 Lab assignments for Introduction Course of Dian AI Group.☆21Jan 26, 2022Updated 4 years ago
- The implementation of "Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Deco…☆37Aug 29, 2025Updated 6 months ago
- [AAAI26] Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilitie…☆10Feb 7, 2026Updated last month
- [NeurIPS'24] "NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes"☆10Sep 18, 2025Updated 6 months ago
- Providing the answer to "How to do patching on all available SAEs on GPT-2?". It is an official repository of the implementation of the p…☆13Jan 26, 2025Updated last year
- ☆17Dec 11, 2022Updated 3 years ago
- ☆26Oct 6, 2024Updated last year
- 🎁[ChatGPT4NLU] A Comparative Study on ChatGPT and Fine-tuned BERT☆192Apr 17, 2023Updated 2 years ago
- [NeurIPS 2023] Official repository for "Distilling Out-of-Distribution Robustness from Vision-Language Foundation Models"☆11Jun 18, 2024Updated last year
- Demo code for the paper: One Thing to Fool them All: Generating Interpretable, Universal, and Physically-Realizable Adversarial Features☆12Nov 30, 2023Updated 2 years ago
- ☆24Dec 13, 2020Updated 5 years ago
- ☆11Jun 20, 2023Updated 2 years ago
- Code for CVPR24 Paper - Resource-Efficient Transformer Pruning for Finetuning of Large Models☆12Oct 31, 2025Updated 4 months ago
- ☆16Feb 8, 2024Updated 2 years ago