google-deepmind / mishax
☆128 Updated last week
Alternatives and similar repositories for mishax:
Users who are interested in mishax are comparing it to the libraries listed below:
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆166 Updated this week
- ☆26 Updated last year
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆192 Updated 3 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆103 Updated 4 months ago
- Functional Benchmarks and the Reasoning Gap ☆84 Updated 6 months ago
- Code for reproducing our paper "Not All Language Model Features Are Linear" ☆73 Updated 4 months ago
- Steering vectors for transformer language models in PyTorch / Hugging Face ☆94 Updated last month
- ☆90 Updated 2 months ago
- Applying SAEs for fine-grained control ☆17 Updated 3 months ago
- Open source replication of Anthropic's Crosscoders for Model Diffing ☆49 Updated 5 months ago
- Erasing concepts from neural representations with provable guarantees ☆228 Updated 2 months ago
- ☆34 Updated last month
- Repository for the paper Stream of Search: Learning to Search in Language ☆144 Updated 2 months ago
- ☆33 Updated 4 months ago
- 🧠 Starter templates for doing interpretability research ☆69 Updated last year
- A MAD laboratory to improve AI architecture designs 🧪 ☆109 Updated 3 months ago
- PyTorch library for Active Fine-Tuning ☆62 Updated last month
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) ☆189 Updated 10 months ago
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆58 Updated 5 months ago
- Sparse and discrete interpretability tool for neural networks ☆62 Updated last year
- Extract full next-token probabilities via language model APIs ☆240 Updated last year
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ☆196 Updated this week
- EvaByte: Efficient Byte-level Language Models at Scale ☆86 Updated 3 weeks ago
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆26 Updated 10 months ago
- ☆71 Updated 2 months ago
- Mechanistic Interpretability Visualizations using React ☆239 Updated 3 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆135 Updated last month
- Code for training & evaluating Contextual Document Embedding models ☆180 Updated 3 months ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆63 Updated 2 weeks ago
- A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning ☆102 Updated this week