To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models
☆33 May 21, 2025 Updated 9 months ago
Alternatives and similar repositories for unthinking_vulnerability
Users who are interested in unthinking_vulnerability are comparing it to the repositories listed below
- This is the official implementation of ICLR 2024 paper "VDC: Versatile Data Cleanser based on Visual-Linguistic Inconsistency by Multimod… ☆19 Feb 24, 2025 Updated last year
- A Framework for Evaluating AI Agent Safety in Realistic Environments ☆30 Oct 2, 2025 Updated 4 months ago
- ☆14 Apr 14, 2025 Updated 10 months ago
- [COLING 2025] Official repo of paper: "Not Aligned" is Not "Malicious": Being Careful about Hallucinations of Large Language Models' Jail… ☆12 Jul 26, 2024 Updated last year
- Pytorch implementation of NPAttack ☆12 Jul 7, 2020 Updated 5 years ago
- ☆11 Apr 27, 2022 Updated 3 years ago
- ☆26 Oct 22, 2025 Updated 4 months ago
- Control LLM ☆22 Apr 6, 2025 Updated 10 months ago
- ☆49 Apr 4, 2025 Updated 10 months ago
- Official repository of paper "Context-DPO: Aligning Language Models for Context-Faithfulness" ☆21 Feb 17, 2025 Updated last year
- ☆32 Oct 13, 2025 Updated 4 months ago
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆30 Nov 12, 2024 Updated last year
- Data-Efficient Backdoor Attacks ☆20 Jun 15, 2022 Updated 3 years ago
- ☆25 Nov 19, 2025 Updated 3 months ago
- ☆53 May 22, 2025 Updated 9 months ago
- Process Orchestration Framework: A camunda 7 fork ☆21 Updated this week
- Sys2Bench is a benchmarking suite designed to evaluate reasoning and planning capabilities of large language models across algorithmic, l… ☆29 Mar 5, 2025 Updated 11 months ago
- ☆21 Mar 17, 2025 Updated 11 months ago
- A Text2SQL benchmark for evaluation of Large Language Models ☆41 Updated this week
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w… ☆12 Jun 28, 2025 Updated 7 months ago
- Official implementation of paper: DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers ☆66 Aug 25, 2024 Updated last year
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers" ☆40 Mar 31, 2025 Updated 10 months ago
- ☆29 Mar 3, 2021 Updated 4 years ago
- Source code of paper ''KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing'' ☆31 Oct 24, 2024 Updated last year
- [NeurIPS 2024] Accelerating Greedy Coordinate Gradient and General Prompt Optimization via Probe Sampling ☆33 Nov 8, 2024 Updated last year
- [AAAI'26 Oral] Official Implementation of STAR-1: Safer Alignment of Reasoning LLMs with 1K Data ☆33 Apr 7, 2025 Updated 10 months ago
- ☆39 May 17, 2025 Updated 9 months ago
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios ☆16 Oct 18, 2024 Updated last year
- SDLC Copilot is an Agentic AI system designed to streamline and automate the Software Development Lifecycle (SDLC). From requirement gath… ☆23 Jun 14, 2025 Updated 8 months ago
- ☆35 May 21, 2025 Updated 9 months ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆32 Aug 5, 2025 Updated 6 months ago
- Auditing agents for fine-tuning safety ☆18 Oct 21, 2025 Updated 4 months ago
- ☆18 Jun 10, 2025 Updated 8 months ago
- Code for ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding [ICML 2025] ☆45 Jul 22, 2025 Updated 7 months ago
- A novel approach to improve the safety of large language models, enabling them to transition effectively from unsafe to safe state. ☆71 May 22, 2025 Updated 9 months ago
- (ACL 2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation ☆34 May 28, 2025 Updated 8 months ago
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ☆40 Feb 13, 2025 Updated last year
- AgenTracer: A Lightweight Failure Attributor for Agentic Systems ☆76 Nov 12, 2025 Updated 3 months ago
- ☆37 Oct 2, 2024 Updated last year