ZiyueWang25 / llm-security-challenge

Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact with and break out of shell environments using the OverTheWire wargames, and show that the models are surprisingly capable of action-oriented cyber exploits in these environments.
11 · Updated last year

Related projects

Alternatives and complementary repositories for llm-security-challenge