[ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use
☆190 · Mar 22, 2024 · Updated last year
Alternatives and similar repositories for ToolEmu
Users who are interested in ToolEmu compare it to the repositories listed below.
- R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024) ☆99 · Jan 11, 2026 · Updated last month
- See also APPL (https://github.com/appl-team/appl), which improves on this project. A Python package for writing language model prompts in a ne… ☆36 · Oct 2, 2023 · Updated 2 years ago
- ☆37 · Oct 15, 2024 · Updated last year
- ☆70 · Feb 4, 2024 · Updated 2 years ago
- ☆118 · Jul 2, 2024 · Updated last year
- ☆21 · Jun 22, 2025 · Updated 8 months ago
- [ICLR 2025] Official Repository for "Tamper-Resistant Safeguards for Open-Weight LLMs" ☆66 · Jun 9, 2025 · Updated 8 months ago
- ☆35 · May 9, 2025 · Updated 9 months ago
- ☆13 · Oct 21, 2021 · Updated 4 years ago
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 ☆241 · May 5, 2024 · Updated last year
- ☆178 · Oct 31, 2025 · Updated 4 months ago
- [NeurIPS'24] RedCode: Risky Code Execution and Generation Benchmark for Code Agents ☆66 · Nov 14, 2025 · Updated 3 months ago
- Repo for the research paper "SecAlign: Defending Against Prompt Injection with Preference Optimization" ☆87 · Jul 24, 2025 · Updated 7 months ago
- A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24) ☆3,187 · Feb 8, 2026 · Updated 3 weeks ago
- ☆16 · Apr 9, 2021 · Updated 4 years ago
- A data construction and evaluation framework to quantify privacy norm awareness of language models (LMs) and emerging privacy risk of LM … ☆43 · Mar 4, 2025 · Updated last year
- Alignment with a millennium of moral progress. Spotlight@NeurIPS 2024 Track on Datasets and Benchmarks. ☆25 · Mar 30, 2025 · Updated 11 months ago
- ☆23 · Oct 25, 2024 · Updated last year
- Official repository of the video reasoning benchmark MMR-V. Can Your MLLMs "Think with Video"? ☆38 · Jun 23, 2025 · Updated 8 months ago
- Code for experiments on self-prediction as a way to measure introspection in LLMs ☆16 · Dec 10, 2024 · Updated last year
- [EMNLP 2025 Oral] IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents ☆16 · Sep 16, 2025 · Updated 5 months ago
- Code&Data for the paper "Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents" [NeurIPS 2024] ☆109 · Sep 27, 2024 · Updated last year
- ☆48 · Sep 29, 2024 · Updated last year
- TrustAgent: Towards Safe and Trustworthy LLM-based Agents ☆56 · Feb 7, 2025 · Updated last year
- ☆24 · Dec 8, 2024 · Updated last year
- ☆23 · Oct 11, 2024 · Updated last year
- Official repo for GPTFUZZER : Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts ☆568 · Updated this week
- Datasets for compositional learning ☆11 · Nov 28, 2018 · Updated 7 years ago
- ☆14 · Mar 9, 2025 · Updated 11 months ago
- This is an anonymous repository for submitting our work to a conference ☆14 · Dec 17, 2024 · Updated last year
- Radiantloom Email Assist 7B is an email-assistant large language model fine-tuned from Zephyr-7B-Beta, over a custom-curated dataset of 1… ☆14 · Jan 19, 2024 · Updated 2 years ago
- The official source code for [2026 ICLR] "IR-Agent: Expert-Inspired LLM Agents for Structure Elucidation from Infrared Spectra" ☆11 · Feb 25, 2026 · Updated last week
- Repository of paper "Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis" (ACL 2025 Main) ☆19 · Jul 19, 2025 · Updated 7 months ago
- Official PyTorch implementation of "MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks" ☆12 · Dec 4, 2025 · Updated 3 months ago
- ☆48 · Feb 8, 2025 · Updated last year
- ☆34 · Mar 6, 2025 · Updated 11 months ago
- A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ☆443 · Feb 3, 2026 · Updated last month
- TensorFlow based implementation of Fully-Convolutional Network ☆12 · Jan 20, 2019 · Updated 7 years ago
- ☆12 · Nov 1, 2024 · Updated last year