Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact with and break out of shell environments using the OverTheWire wargames environment, showing the models' surprising ability to carry out action-oriented cyber exploits.
☆13 · Aug 21, 2023 · Updated 2 years ago
Alternatives and similar repositories for llm-security-challenge
Users that are interested in llm-security-challenge are comparing it to the libraries listed below.
- Whispers in the Machine: Confidentiality in Agentic Systems ☆44 · Apr 20, 2026 · Updated 2 weeks ago
- Pin files for contextual, codebase-level AI assistance. ☆16 · Jul 11, 2024 · Updated last year
- Repo for the paper on Escalation Risks of AI systems ☆44 · Apr 12, 2024 · Updated 2 years ago
- Risks and targets for assessing LLMs & LLM vulnerabilities ☆34 · May 27, 2024 · Updated last year
- ☆11 · Sep 7, 2023 · Updated 2 years ago
- ☆14 · Mar 31, 2024 · Updated 2 years ago
- Multiplayer JS game platform ☆16 · Oct 16, 2017 · Updated 8 years ago
- Example fNIRS BIDS dataset ☆15 · Nov 4, 2022 · Updated 3 years ago
- 📚📚📚📚📚📚📚📚📚 Reading everything ☆15 · Mar 11, 2026 · Updated last month
- 🧠 Inspecting complexity and goal-directedness of imagination in an fNIRS BCI system. ☆11 · Aug 26, 2023 · Updated 2 years ago
- Sample Excel add-in and Python script code to run an agent using LLM from an Excel function ☆20 · Jul 16, 2024 · Updated last year
- Methods 2: The General Linear Model ☆15 · May 5, 2022 · Updated 4 years ago
- Code for Preventing Language Models From Hiding Their Reasoning, which evaluates defenses against LLM steganography. ☆25 · Jan 26, 2024 · Updated 2 years ago
- ☆22 · Jul 18, 2024 · Updated last year
- ☆13 · Dec 22, 2023 · Updated 2 years ago
- 👩‍💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark" ☆20 · Jan 19, 2024 · Updated 2 years ago
- Agent-Friendly Web Principles ☆31 · Oct 15, 2025 · Updated 6 months ago
- The following is a simple example of how LLMs and langchain agents can simplify asking questions to understand the security posture of a … ☆23 · Aug 23, 2023 · Updated 2 years ago
- Website for PauseAI.info ☆26 · Updated this week
- Fine-tuning of transformers for Sentiment Analysis ☆19 · May 25, 2021 · Updated 4 years ago
- A web service in PHP that "translates" HackNPlan webhook messages to Discord webhook messages. ☆16 · Feb 23, 2023 · Updated 3 years ago
- LLM security and privacy ☆54 · Oct 15, 2024 · Updated last year
- Code for the paper "Understanding RL Vision" ☆51 · Apr 2, 2023 · Updated 3 years ago
- The Happy Faces Benchmark ☆15 · Jul 20, 2023 · Updated 2 years ago
- A collection of security papers on top-tier publications ☆67 · Apr 16, 2026 · Updated 3 weeks ago
- LLMs for Wargames ☆22 · Sep 21, 2024 · Updated last year
- The burp extension to forward the request ☆10 · Oct 21, 2024 · Updated last year
- A Novel Benchmark evaluating the Deep Capability of Vulnerability Detection with Large Language Models ☆34 · Apr 25, 2025 · Updated last year
- This is the starter kit for the Trojan Detection Challenge 2023 (LLM Edition), a NeurIPS 2023 competition. ☆91 · May 19, 2024 · Updated last year
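- GitHub information-leak collection tool. An upgraded version of GSIL that removes the email delivery method and saves the results locally. ☆13 · Mar 20, 2021 · Updated 5 years ago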
- This repository contains various shell scripts and tips and tricks used for packaging androidtamer packages ☆13 · Jul 10, 2022 · Updated 3 years ago
- Sentida ☆22 · Dec 14, 2021 · Updated 4 years ago
- [TMLR 2024] On the Adversarial Robustness of Camera-based 3D Object Detection ☆31 · Apr 23, 2024 · Updated 2 years ago
- Situational Awareness Dataset ☆50 · Dec 14, 2024 · Updated last year
- ☆10 · Jun 29, 2020 · Updated 5 years ago
- Delving into the Realm of LLM Security: An Exploration of Offensive and Defensive Tools, Unveiling Their Present Capabilities. ☆168 · Oct 13, 2023 · Updated 2 years ago
- Gremlin Documentation and Samples ☆46 · Sep 7, 2021 · Updated 4 years ago
- ☆13 · Sep 21, 2025 · Updated 7 months ago
- Debugger for HTC phones bootloader (HBOOT). ☆20 · Nov 28, 2013 · Updated 12 years ago