acsresearch / interlab
☆20 Updated 10 months ago
Alternatives and similar repositories for interlab
Users that are interested in interlab are comparing it to the libraries listed below
- ☆15 Updated 3 weeks ago
- ☆54 Updated 8 months ago
- Redwood Research's transformer interpretability tools ☆15 Updated 3 years ago
- Machine Learning for Alignment Bootcamp ☆73 Updated 3 years ago
- A TinyStories LM with SAEs and transcoders ☆11 Updated 2 months ago
- ControlArena is a suite of realistic settings, mimicking complex deployment environments, for running control evaluations. This is an alp… ☆61 Updated this week
- A dataset of alignment research and code to reproduce it ☆77 Updated last year
- Mechanistic Interpretability for Transformer Models ☆51 Updated 3 years ago
- (Model-written) LLM evals library ☆18 Updated 5 months ago
- METR Task Standard ☆148 Updated 4 months ago
- Stampy's copy of Alignment Research Dataset scraper ☆12 Updated 2 weeks ago
- Tools for studying developmental interpretability in neural networks. ☆91 Updated 4 months ago
- ☆21 Updated 8 months ago
- Code for reproducing the results from the paper Avoiding Side Effects in Complex Environments ☆12 Updated 4 years ago
- Interpreting how transformers simulate agents performing RL tasks ☆83 Updated last year
- General-Sum variant of the game Diplomacy for evaluating AIs. ☆29 Updated last year
- Benchmark environments for reward modelling and imitation learning algorithms. ☆46 Updated last year
- we got you bro ☆35 Updated 10 months ago
- A toolbox with the goal of speeding up research on bargaining in MARL (cooperation problems in MARL). ☆32 Updated 2 years ago
- Automated Capability Discovery via Foundation Model Self-Exploration ☆50 Updated 3 months ago
- Inference API for many LLMs and other useful tools for empirical research ☆48 Updated last week
- ☆96 Updated 2 months ago
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ☆94 Updated last week
- ☆132 Updated 7 months ago
- Psych 290Q S23 @ UC Berkeley: Large Language Models and Cognitive Science ☆18 Updated last year
- Formal Contracts for Multi-Agent Reinforcement Learning ☆17 Updated last year
- Language-annotated Abstraction and Reasoning Corpus ☆86 Updated 2 years ago
- ☆19 Updated 2 years ago
- A python sdk for LLM finetuning and inference on runpod infrastructure ☆11 Updated this week
- A collection of different ways to implement accessing and modifying internal model activations for LLMs ☆16 Updated 7 months ago