yash-srivastava19 / arrakis
Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments.
☆30 · Apr 22, 2025 · Updated 9 months ago
Alternatives and similar repositories for arrakis
Users who are interested in arrakis are comparing it to the libraries listed below.
- MishformerLens intends to be a drop-in replacement for TransformerLens that AST patches HuggingFace Transformers rather than implementing… ☆10 · Oct 7, 2024 · Updated last year
- A library for training crosscoders ☆15 · May 28, 2025 · Updated 8 months ago
- Tools for optimizing steering vectors in LLMs. ☆19 · Apr 10, 2025 · Updated 10 months ago
- A benchmark for mechanistic discovery of circuits in Transformers ☆16 · Dec 15, 2024 · Updated last year
- ☆23 · Jun 30, 2025 · Updated 7 months ago
- Engine for collecting, uploading, and downloading model activations ☆26 · Apr 2, 2025 · Updated 10 months ago
- Minimum Description Length probing for neural network representations ☆20 · Jan 28, 2025 · Updated last year
- ☆20 · Apr 10, 2025 · Updated 10 months ago
- ☆20 · Feb 17, 2023 · Updated 3 years ago
- ☆25 · Apr 18, 2025 · Updated 10 months ago
- Unified access to Large Language Model modules using NNsight ☆88 · Updated this week
- ☆58 · Nov 19, 2024 · Updated last year
- A toolkit that provides a range of model diffing techniques including a UI to visualize them interactively. ☆61 · Updated this week
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers ☆74 · Jun 23, 2025 · Updated 7 months ago
- ☆83 · Feb 25, 2025 · Updated 11 months ago
- 🪝PISCES - Precise In-Parameter Suppression for Concept EraSure in Large Language Models ☆12 · May 30, 2025 · Updated 8 months ago
- ☆13 · Jan 12, 2023 · Updated 3 years ago
- Code for my NeurIPS 2024 ATTRIB paper titled "Attribution Patching Outperforms Automated Circuit Discovery" ☆47 · May 31, 2024 · Updated last year
- ☆51 · Oct 23, 2023 · Updated 2 years ago
- Open source interpretability artefacts for R1. ☆170 · Apr 21, 2025 · Updated 9 months ago
- Trains small LMs. Designed for training on SimpleStories ☆12 · Sep 15, 2025 · Updated 5 months ago
- Geodesic updates for Nickel & Kiela's graph embedding algorithm in hyperbolic space. ☆11 · Jun 6, 2018 · Updated 7 years ago
- A framework for evaluating Machine Translation models. ☆12 · May 26, 2025 · Updated 8 months ago
- Online Simultaneous Localization and Mapping in ROS ☆11 · Jan 31, 2019 · Updated 7 years ago
- ☆82 · Jan 31, 2026 · Updated 2 weeks ago
- ☆14 · Apr 29, 2025 · Updated 9 months ago
- Residual Quantization Autoencoder, used for interpreting LLMs ☆13 · Jan 1, 2025 · Updated last year
- Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch ☆10 · Aug 7, 2024 · Updated last year
- Reasoning-based Evaluation and Ranking of Translations. ☆19 · Jul 18, 2025 · Updated 7 months ago
- Official implementation of "Interpreting and Controlling Vision Foundation Models via Text Explanations" ☆14 · May 29, 2024 · Updated last year
- ☆14 · Mar 15, 2025 · Updated 11 months ago
- Code for the AACL 2022 Paper "This Patient Looks Like That Patient: Prototypical Networks for Interpretable Diagnosis Prediction from Cli… ☆12 · Nov 18, 2022 · Updated 3 years ago
- Balaram is an agriculture-based chatbot which answers questions related to farming practices. ☆15 · May 13, 2021 · Updated 4 years ago
- ☆13 · Apr 10, 2025 · Updated 10 months ago
- Dynamical Systems with JAX ☆12 · Jan 11, 2026 · Updated last month
- ☆16 · Dec 10, 2025 · Updated 2 months ago
- Github repository for "Internalizing World Models via Self-Play Finetuning for Agentic RL" ☆33 · Nov 1, 2025 · Updated 3 months ago
- ☆10 · Mar 19, 2024 · Updated last year
- Collection of Jupyter notebooks ☆12 · Jan 9, 2022 · Updated 4 years ago