ApolloResearch / sample
Repository with sample code using Apollo's suggested engineering practices
☆15 · Updated last year
Alternatives and similar repositories for sample
Users who are interested in sample are comparing it to the libraries listed below.
- Tools for studying developmental interpretability in neural networks. ☆122 · Updated last week
- Starter templates for doing interpretability research ☆76 · Updated 2 years ago
- we got you bro ☆37 · Updated last year
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆132 · Updated 3 years ago
- Decoder only transformer, built from scratch with PyTorch ☆32 · Updated 2 years ago
- Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL. ☆236 · Updated 5 months ago
- ☆150 · Updated 4 months ago
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆63 · Updated last year
- Open source interpretability artefacts for R1. ☆165 · Updated 8 months ago
- ☆59 · Updated this week
- Redwood Research's transformer interpretability tools ☆14 · Updated 3 years ago
- Attribution-based Parameter Decomposition ☆33 · Updated 7 months ago
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆235 · Updated 3 weeks ago
- ☆17 · Updated 3 weeks ago
- Sparse Autoencoder Training Library ☆56 · Updated 8 months ago
- Inference API for many LLMs and other useful tools for empirical research ☆95 · Updated 2 weeks ago
- ☆29 · Updated last year
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆236 · Updated last year
- epsilon machines and transformers! ☆34 · Updated 6 months ago
- Machine Learning for Alignment Bootcamp ☆81 · Updated 3 years ago
- METR Task Standard ☆169 · Updated 11 months ago
- Mechanistic Interpretability Visualizations using React ☆307 · Updated last year
- ☆36 · Updated last year
- ☆132 · Updated 2 years ago
- ☆58 · Updated last year
- ☆262 · Updated last year
- ☆83 · Updated 3 weeks ago
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆28 · Updated last year
- ☆31 · Updated 6 months ago
- ☆104 · Updated 5 months ago