haizelabs / nyc-ai-reading

nyc is so back

☆18

Alternatives and similar repositories for nyc-ai-reading

Users who are interested in nyc-ai-reading are comparing it to the repositories listed below.

METR / RE-Bench
☆117 Updated last month
justinchiu / openlogprobs
Extract full next-token probabilities via language model APIs
☆247 Updated last year
aypan17 / machiavelli
☆141 Updated 4 months ago
anthropics / evals
☆310 Updated last year
METR / task-standard
METR Task Standard
☆167 Updated 9 months ago
google-research / cascades
Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference…
☆215 Updated 5 months ago
TransformerLensOrg / CircuitsVis
Mechanistic Interpretability Visualizations using React
☆301 Updated 11 months ago
neelnanda-io / 1L-Sparse-Autoencoder
☆132 Updated 2 years ago
rgreenblatt / arc_draw_more_samples_pub
Draw more samples
☆195 Updated last year
LeonGuertler / UnstableBaselines
☆106 Updated last month
srush / GPTWorld
A puzzle to learn about prompting
☆135 Updated 2 years ago
goodfire-ai / spd
Stochastic Parameter Decomposition
☆51 Updated last week
callummcdougall / ARENA_2.0
Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL.
☆232 Updated 3 months ago
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆227 Updated 11 months ago
timaeus-research / devinterp
Tools for studying developmental interpretability in neural networks.
☆114 Updated 4 months ago
collin-burns / discovering_latent_knowledge
☆282 Updated last year
srush / raspy
An interactive exploration of Transformer programming.
☆271 Updated 2 years ago
anthropics / toy-models-of-superposition
Notebooks accompanying Anthropic's "Toy Models of Superposition" paper
☆130 Updated 3 years ago
ArthurConmy / Automatic-Circuit-Discovery
☆253 Updated last year
princeton-nlp / USACO
Can Language Models Solve Olympiad Programming?
☆122 Updated 10 months ago
TransluceAI / docent
☆63 Updated last month
probcomp / LLaMPPL
A domain-specific probabilistic programming language for modeling and inference with language models
☆137 Updated 6 months ago
r-three / git-theta
git extension for {collaborative, communal, continual} model development
☆215 Updated last year
EleutherAI / concept-erasure
Erasing concepts from neural representations with provable guarantees
☆238 Updated 9 months ago
redwoodresearch / interp
Redwood Research's transformer interpretability tools
☆14 Updated 3 years ago
mechanistic-interpretability-grokking / progress-measures-paper
☆70 Updated 3 years ago
UFO-101 / auto-circuit
A library for efficient patching and automatic circuit discovery.
☆80 Updated 4 months ago
EleutherAI / elk
Keeping language models honest by directly eliciting knowledge encoded in their activations.
☆212 Updated last week
safety-research / safety-tooling
Inference API for many LLMs and other useful tools for empirical research
☆80 Updated last week
likenneth / othello_world
Emergent world representations: Exploring a sequence model trained on a synthetic task
☆191 Updated 2 years ago