haizelabs / nyc-ai-reading
nyc is so back
☆18 Updated 3 months ago
Alternatives and similar repositories for nyc-ai-reading
Users who are interested in nyc-ai-reading are comparing it to the libraries listed below
- Extract full next-token probabilities via language model APIs ☆248 Updated last year
- Redwood Research's transformer interpretability tools ☆14 Updated 3 years ago
- ☆101 Updated last week
- METR Task Standard ☆160 Updated 7 months ago
- Draw more samples ☆193 Updated last year
- ☆101 Updated 5 months ago
- Applying SAEs for fine-grained control ☆23 Updated 9 months ago
- Inference API for many LLMs and other useful tools for empirical research ☆71 Updated last week
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆28 Updated last year
- Notebooks accompanying Anthropic's "Toy Models of Superposition" paper ☆129 Updated 3 years ago
- seqax = sequence modeling + JAX ☆167 Updated 2 months ago
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆193 Updated last year
- ☆127 Updated last year
- Attribution-based Parameter Decomposition ☆30 Updated 3 months ago
- A domain-specific probabilistic programming language for modeling and inference with language models ☆135 Updated 4 months ago
- Mechanistic Interpretability Visualizations using React ☆289 Updated 9 months ago
- Probabilistic programming with large language models ☆136 Updated 2 months ago
- ☆300 Updated last year
- Stochastic Parameter Decomposition ☆39 Updated this week
- Tools for studying developmental interpretability in neural networks. ☆103 Updated 3 months ago
- Resources for skilling up in AI alignment research engineering. Covers basics of deep learning, mechanistic interpretability, and RL. ☆227 Updated last month
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆220 Updated 9 months ago
- ☆26 Updated 3 months ago
- ☆281 Updated last year
- Bootstrapping ARC ☆143 Updated 10 months ago
- ☆242 Updated 11 months ago
- Multiple datasets for ARC (Abstraction and Reasoning Corpus) ☆81 Updated 5 months ago
- ☆14 Updated 10 months ago
- Reverse Engineering the Abstraction and Reasoning Corpus ☆305 Updated 7 months ago
- ☆69 Updated 2 years ago