A text-based game where language models learn to lie and to detect lies.
☆12 · Oct 4, 2023 · Updated 2 years ago
Alternatives and similar repositories for hoodwinked-website
Users who are interested in hoodwinked-website are comparing it to the libraries listed below.
- ☆11 · Oct 24, 2022 · Updated 3 years ago
- Reinforcement Learning Replications is a set of PyTorch implementations of reinforcement learning algorithms. ☆24 · Apr 4, 2026 · Updated last month
- Measuring the situational awareness of language models. ☆41 · Feb 12, 2024 · Updated 2 years ago
- Starter kit and data loading code for the Trojan Detection Challenge NeurIPS 2022 competition. ☆33 · Jul 26, 2023 · Updated 2 years ago
- Contemplative reasoning MCP server: Lotus Sutra wisdom framework with interactive ext-apps journey visualization. ☆27 · Apr 14, 2026 · Updated last month
- ☆17 · Dec 21, 2023 · Updated 2 years ago
- Code for Preventing Language Models From Hiding Their Reasoning, which evaluates defenses against LLM steganography.