jprivera44 / EscalAItion
Repo for the paper on Escalation Risks of AI systems
☆36Updated 10 months ago
Alternatives and similar repositories for EscalAItion:
Users who are interested in EscalAItion are comparing it to the repositories listed below
- ☆50Updated 4 months ago
- Documentation for dynamic machine learning systems.☆29Updated 5 months ago
- Demo of using ChatGPT API for language learning☆12Updated last year
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models☆50Updated 2 weeks ago
- Interpreting how transformers simulate agents performing RL tasks☆77Updated last year
- An attribution library for LLMs☆37Updated 5 months ago
- Measuring the situational awareness of language models☆34Updated last year
- ☆61Updated 3 weeks ago
- ☆18Updated 7 months ago
- Explainable Reinforcement Learning (XRL) Resources☆37Updated 4 months ago
- A benchmark for evaluating learning agents based on just language feedback☆66Updated 4 months ago
- ☆14Updated 4 months ago
- A dataset of alignment research and code to reproduce it☆73Updated last year
- ☆48Updated 3 months ago
- MER is software that identifies and highlights manipulative communication in text from human conversations and AI-generated responses. …☆13Updated 6 months ago
- ☆48Updated last year
- ☆17Updated last week
- A text-based game where language models learn to lie and to detect lies.☆12Updated last year
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models☆69Updated last year
- ☆32Updated last year
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆40Updated 8 months ago
- Repo to reproduce the First-Explore paper results☆37Updated last month
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset☆14Updated 11 months ago
- Small, simple agent task environments for training and evaluation☆18Updated 3 months ago
- Dataset and benchmark for assessing LLMs in translating natural language descriptions of planning problems into PDDL☆48Updated 4 months ago
- Uses a Gradio interface to stream coding-related responses from local and cloud-based large language models. Pulls context from GitHub Re…☆18Updated 5 months ago
- LLM Optimize is a proof-of-concept library for doing LLM (large language model) guided blackbox optimization.☆53Updated last year
- General-Sum variant of the game Diplomacy for evaluating AIs.☆29Updated 10 months ago