haonan3 / V1
V1: Toward Multimodal Reasoning by Designing Auxiliary Task
☆20 · Updated last week
Alternatives and similar repositories for V1:
Users interested in V1 are comparing it to the repositories listed below.
- [NeurIPS 2024] "Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?" ☆35 · Updated 2 months ago
- Official repository for the paper "Safety Alignment Should Be Made More Than Just a Few Tokens Deep" ☆82 · Updated 8 months ago
- The official repository for the paper "MLLM-Protector: Ensuring MLLM’s Safety without Hurting Performance" ☆34 · Updated 11 months ago
- The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? ☆27 · Updated 4 months ago
- [ICLR 2025] Code & Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆13 · Updated 9 months ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆54 · Updated 5 months ago
- ☆20 · Updated last week
- [ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs" ☆78 · Updated last year
- ☆33 · Updated 5 months ago
- Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024) ☆59 · Updated 2 months ago
- "In-Context Unlearning: Language Models as Few Shot Unlearners". Martin Pawelczyk, Seth Neel* and Himabindu Lakkaraju*; ICML 2024. ☆24 · Updated last year
- Code for the paper "Aligning Large Language Models with Representation Editing: A Control Perspective" ☆25 · Updated last month
- An implementation for MLLM oversensitivity evaluation ☆10 · Updated 4 months ago
- Code for "Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective" ☆19 · Updated last year
- [NeurIPS 2024] RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models ☆69 · Updated 5 months ago
- Official repo for the EMNLP'24 paper "SOUL: Unlocking the Power of Second-Order Optimization for LLM Unlearning" ☆22 · Updated 5 months ago
- ☆25 · Updated 10 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆55 · Updated last month
- Codebase for the paper "Decoding Compressed Trust" ☆23 · Updated 10 months ago
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) ☆74 · Updated 5 months ago
- Official code and data for the ACL 2024 Findings paper "An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models" ☆16 · Updated 4 months ago
- The official implementation of "LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation" ☆13 · Updated last week
- ☆19 · Updated 3 weeks ago
- Official codebase for "STAIR: Improving Safety Alignment with Introspective Reasoning" ☆28 · Updated 3 weeks ago
- ☆16 · Updated last week
- [ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications ☆73 · Updated 3 weeks ago
- Awesome Large Reasoning Model (LRM) Safety. This repository collects security-related research on large reasoning models such as … ☆53 · Updated this week
- ☆50 · Updated 8 months ago
- ECSO (make MLLMs safe with neither training nor any external models!) (https://arxiv.org/abs/2403.09572) ☆23 · Updated 4 months ago
- (ICLR 2025 Spotlight) DEEM: Official implementation of "Diffusion models serve as the eyes of large language models for image perception" ☆26 · Updated 2 weeks ago