☆27Jun 28, 2025Updated 8 months ago
Alternatives and similar repositories for G-safeguard
Users who are interested in G-safeguard are comparing it to the libraries listed below.
- ☆37Oct 15, 2024Updated last year
- TI-RSLK maze-solving car☆14Apr 27, 2019Updated 6 years ago
- ☆24Jul 27, 2024Updated last year
- [NeurIPS 2025] The official implementation of the paper "DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agen…☆39Feb 14, 2026Updated 3 weeks ago
- FGLA: Fast Generation-Based Gradient Leakage Attacks against Highly Compressed Gradients☆14Dec 20, 2022Updated 3 years ago
- An open-source non-official community implementation of the model from the paper: Surgical Robot Transformer (SRT): Imitation Learning fo…☆11Feb 9, 2026Updated last month
- ☆118Jul 2, 2024Updated last year
- Official repo of Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics☆70Jan 27, 2026Updated last month
- Dataset for training EEG IC classifiers.☆14Aug 29, 2021Updated 4 years ago
- An open source community who focuses on developing and publishing elegant algorithms, models and tools for science big data mining and kn…☆10Jul 27, 2019Updated 6 years ago
- ☆14Oct 19, 2025Updated 4 months ago
- This repository contains reference implementation for multi-LLM ToM paper (accepted to EMNLP 2023), Theory of Mind for Multi-Agent Collab…☆18Jun 11, 2024Updated last year
- ☆10Oct 20, 2022Updated 3 years ago
- Camouflage YOLO - (CAMOLO) trains adversarial patches to confuse the YOLO family of object detectors.☆12Oct 20, 2022Updated 3 years ago
- The code for paper "ProQA: Structural Prompt-based Pre-training for Unified Question Answering"☆11Feb 7, 2023Updated 3 years ago
- ☆14Mar 9, 2025Updated last year
- Self-Teaching Notes on Gradient Leakage Attacks against GPT-2 models.☆14Mar 18, 2024Updated last year
- ☆11Apr 12, 2024Updated last year
- Working with images in frequency space☆10Nov 5, 2020Updated 5 years ago
- Generating Potent Poisons and Backdoors from Scratch with Guided Diffusion☆11Apr 1, 2024Updated last year
- A Multi-Agent Framework for Collaborative Criticism and Refinement in Table Reasoning.☆19Aug 23, 2025Updated 6 months ago
- Code for our NeurIPS 2024 paper Improved Generation of Adversarial Examples Against Safety-aligned LLMs☆12Nov 7, 2024Updated last year
- 🌟 SwarmAgent: A framework for simulating social group dynamics using multi-agent collaboration, aiding insights into collective behavior…☆12Dec 5, 2023Updated 2 years ago
- Reproduction of Curiosity-driven Exploration by Self-supervised Prediction in PyTorch☆13Jun 10, 2019Updated 6 years ago
- Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems"☆13Dec 13, 2024Updated last year
- Official release of code for the paper "RL is a hammer and LLMs are nails: A simple RL approach to stronger prompt injection attacks"☆40Feb 11, 2026Updated 3 weeks ago
- Towards a Mechanistic Understanding of Large Reasoning Models: A Survey of Training, Inference, and Failures☆30Jan 29, 2026Updated last month
- ☆76Dec 5, 2024Updated last year
- ☆14Sep 17, 2024Updated last year
- CMU RavenClaw dialogue management☆12Dec 13, 2017Updated 8 years ago
- ☆14Apr 6, 2025Updated 11 months ago
- Code for “SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation” (ICLR 2025)☆24Oct 23, 2025Updated 4 months ago
- Focused Papers, Delivered Simply :)☆51Dec 25, 2025Updated 2 months ago
- Code for our TSD paper "TOKEN is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models"☆14Aug 19, 2022Updated 3 years ago
- Code and full version of the paper "Hijacking Attacks against Neural Network by Analyzing Training Data"☆14Feb 28, 2024Updated 2 years ago
- Code of paper: "xJailbreak: Representation Space Guided Reinforcement Learning for Interpretable LLM Jailbreaking"☆17Feb 17, 2026Updated 2 weeks ago
- ☆16Oct 18, 2023Updated 2 years ago
- Data and codes for EMNLP 2022 paper "CDConv: A Benchmark for Contradiction Detection in Chinese Conversations"☆13May 8, 2023Updated 2 years ago
- ☆12Dec 23, 2020Updated 5 years ago