jprivera44 / EscalAItion
Repo for the paper on Escalation Risks of AI systems
☆40Updated last year
Alternatives and similar repositories for EscalAItion
Users that are interested in EscalAItion are comparing it to the libraries listed below
- ☆20Updated last year
- ☆73Updated 4 months ago
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models☆61Updated 5 months ago
- ☆71Updated last week
- An OpenAI gym environment to evaluate the ability of LLMs (e.g., GPT-4, Claude) in long-horizon reasoning and task planning in dynamic mult…☆69Updated 2 years ago
- Automating enterprise workflows with multimodal agents☆108Updated 10 months ago
- A repo to evaluate various LLMs' chess-playing abilities.☆83Updated last year
- Code for "Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs"☆54Updated 5 months ago
- ☆52Updated last year
- An automated tool for discovering insights from research paper corpora☆138Updated last year
- Problem solving by engaging multiple AI agents in conversation with each other and the user.☆223Updated last year
- [NeurIPS '23 Spotlight] Thought Cloning: Learning to Think while Acting by Imitating Human Thinking☆268Updated last year
- A virtual environment for developing and evaluating automated scientific discovery agents.☆168Updated 5 months ago
- Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents☆125Updated last year
- General-Sum variant of the game Diplomacy for evaluating AIs.☆29Updated last year
- ☆137Updated 2 weeks ago
- A framework to enable multimodal models to play games on a computer.☆98Updated last year
- An attribution library for LLMs☆42Updated 10 months ago
- Hypothetical Minds is an autonomous LLM-based agent for diverse multi-agent settings, integrating a Theory of Mind module Theory of Mind …☆33Updated last year
- A benchmark for evaluating learning agents based on just language feedback☆86Updated 2 months ago
- How to create rational LLM-based agents? Using game-theoretic workflows!☆73Updated 2 months ago
- General multi-task deep RL Agent☆184Updated last year
- A preprint version of our recent research on the capability of frontier AI systems to do self-replication☆59Updated 7 months ago
- Automated Capability Discovery via Foundation Model Self-Exploration☆59Updated 5 months ago
- ☆219Updated 2 years ago
- ☆29Updated 11 months ago
- ☆99Updated 4 months ago
- Evaluating LLMs with CommonGen-Lite☆90Updated last year
- A framework for orchestrating AI agents using a mermaid graph☆77Updated last year
- Repository for the paper Stream of Search: Learning to Search in Language☆150Updated 6 months ago