METR / task-template

☆9

Alternatives and similar repositories for task-template

Users who are interested in task-template are comparing it to the repositories listed below.

METR / vivaria
Vivaria is METR's tool for running evaluations and conducting agent elicitation research.
☆103 Updated last week
METR / task-standard
METR Task Standard
☆156 Updated 6 months ago
UKGovernmentBEIS / control-arena
ControlArena is a collection of settings, model organisms and protocols - for running control experiments.
☆80 Updated last week
timaeus-research / devinterp
Tools for studying developmental interpretability in neural networks.
☆100 Updated last month
redwoodresearch / mlab
Machine Learning for Alignment Bootcamp
☆76 Updated 3 years ago
alignedai / HappyFaces
The Happy Faces Benchmark
☆15 Updated 2 years ago
anthropics / PySvelte
A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations
☆194 Updated 3 years ago
callummcdougall / sae_visualizer
☆28 Updated last year
moirage / alignment-research-dataset
A dataset of alignment research and code to reproduce it
☆77 Updated 2 years ago
thestephencasper / everything-you-need
we got you bro
☆36 Updated last year
METR / public-tasks
☆99 Updated 4 months ago
LRudL / evalugator
(Model-written) LLM evals library
☆18 Updated 7 months ago
Mech-Interp / PySvelte
A library for bridging Python and HTML/Javascript (via Svelte) for creating interactive visualizations
☆14 Updated last year
anthropics / toy-models-of-superposition
Notebooks accompanying Anthropic's "Toy Models of Superposition" paper
☆128 Updated 2 years ago
nickkeesG / Pantheon
Experimental LLM interface exploring new ways to use AI to improve human thinking
☆18 Updated 5 months ago
apartresearch / interpretability-starter
🧠 Starter templates for doing interpretability research
☆73 Updated 2 years ago
redwoodresearch / interp
Redwood Research's transformer interpretability tools
☆14 Updated 3 years ago
quantified-uncertainty / ai-safety-papers
☆21 Updated 3 years ago
EleutherAI / elk
Keeping language models honest by directly eliciting knowledge encoded in their activations.
☆209 Updated last week
apple / ml-np-rasp
☆19 Updated last year
redwoodresearch / rust_circuit_public
☆63 Updated 2 years ago
TomFrederik / unseal
Mechanistic Interpretability for Transformer Models
☆51 Updated 3 years ago
jessicarumbelow / Backwards
☆84 Updated last year
AsaCooperStickland / situational-awareness-evals
Measuring the situational awareness of language models
☆37 Updated last year
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆207 Updated 7 months ago
google-deepmind / mishax
☆136 Updated 4 months ago
callummcdougall / ARENA_2.0
Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL.
☆220 Updated last year
DAIOS-AI / mindscript
A programming language for formal/informal computation.
☆41 Updated last week
Kiv / fancy_einsum
Einsum with einops style variable names
☆17 Updated last year
mukobi / welfare-diplomacy
General-Sum variant of the game Diplomacy for evaluating AIs.
☆29 Updated last year