A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models.
☆106 · Oct 4, 2023 · Updated 2 years ago
Alternatives and similar repositories for NeuroX
Users who are interested in NeuroX are comparing it to the libraries listed below.
- Mechanistic Interpretability for Transformer Models ☆53 · Jun 1, 2022 · Updated 3 years ago
- This is a repository with the code for the EMNLP 2020 paper "Information-Theoretic Probing with Minimum Description Length" ☆71 · Aug 20, 2024 · Updated last year
- Explicit Alignment Objectives for Multilingual Bidirectional Encoders ☆14 · Apr 14, 2021 · Updated 4 years ago
- Making a bridge between NLP models and Brain data ☆19 · Jun 3, 2020 · Updated 5 years ago
- ☆38 · Apr 23, 2019 · Updated 6 years ago
- ☆15 · Apr 10, 2018 · Updated 7 years ago
- Overview of corpora/datasets for Germanic low-resource languages and dialects. Accompanies "A Survey of Corpora for Germanic Low-Resource… ☆26 · Feb 16, 2026 · Updated 2 weeks ago
- Landing page for MIB: A Mechanistic Interpretability Benchmark ☆24 · Aug 15, 2025 · Updated 6 months ago
- This is the official implementation for the paper "Learning to Scaffold: Optimizing Model Explanations for Teaching" ☆20 · May 19, 2022 · Updated 3 years ago
- Teaching Models to Express Their Uncertainty in Words ☆39 · May 26, 2022 · Updated 3 years ago
- [ACL 2023] Counterspeeches up my sleeve! Intent Distribution Learning and Persistent Fusion for Intent-Conditioned Counterspeech Generati… ☆10 · Sep 23, 2023 · Updated 2 years ago
- A framework for evaluating Machine Translation models. ☆12 · May 26, 2025 · Updated 9 months ago
- Code for the NAACL 2024 HCI+NLP Workshop paper "LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tool… ☆13 · Mar 24, 2024 · Updated last year
- [Kauf & Ivanova, ACL 2023] A Better Way to Do Masked Language Model Scoring ☆10 · Dec 1, 2023 · Updated 2 years ago
- EMNLP 2024 | Style-Specific Neurons for Steering LLMs in Text Style Transfer ☆13 · Mar 23, 2025 · Updated 11 months ago
- Tools for understanding how transformer predictions are built layer-by-layer ☆570 · Aug 7, 2025 · Updated 7 months ago
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆28 · May 23, 2024 · Updated last year
- Keeping language models honest by directly eliciting knowledge encoded in their activations. ☆217 · Updated this week
- A Dataset and Results for Classifying Emotions Across Languages ☆10 · Jun 20, 2021 · Updated 4 years ago
- ☆11 · Dec 1, 2020 · Updated 5 years ago
- A multi-species, multi-ephys format histology processing and probe alignment pipeline ☆13 · Mar 20, 2024 · Updated last year
- DreamGaussian with 2D-GS ☆12 · Oct 10, 2024 · Updated last year
- A quick way to get started with Transformer Lens ☆14 · Dec 13, 2023 · Updated 2 years ago
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models" ☆13 · Jul 18, 2024 · Updated last year
- ☆15 · Apr 20, 2018 · Updated 7 years ago
- Dual optimization to learn laplacian eigenpairs in arbitrary spaces ☆16 · Dec 18, 2024 · Updated last year
- PHP low-level client for Vespa. https://vespa.ai/ ☆17 · Jan 22, 2026 · Updated last month
- ☆10 · Jul 27, 2018 · Updated 7 years ago
- Code for Analyzing Redundancy in Pretrained Transformer Models accepted at EMNLP 2020 ☆14 · Oct 6, 2020 · Updated 5 years ago
- A2T: Towards Improving Adversarial Training of NLP Models (EMNLP 2021 Findings) ☆27 · Sep 12, 2021 · Updated 4 years ago
- Repository describing example random control tasks for designing and interpreting neural probes ☆32 · Jun 21, 2022 · Updated 3 years ago
- Stanford NLP Python library for understanding and improving PyTorch models via interventions ☆866 · Updated this week
- Getting interpretable dimensions in word embedding spaces. ☆15 · Jul 6, 2023 · Updated 2 years ago
- This is the code for our ACL 2021 paper entitled eMLM: A New Pre-training Objective for Emotion Related Tasks ☆15 · Sep 7, 2022 · Updated 3 years ago
- X2Static embeddings ☆15 · Jul 15, 2021 · Updated 4 years ago
- Debiasing Methods in Natural Language Understanding Make Bias More Accessible: Code and Data ☆14 · Apr 24, 2022 · Updated 3 years ago
- ☆17 · Dec 11, 2024 · Updated last year
- ☆15 · Jul 1, 2020 · Updated 5 years ago
- ☆19 · Sep 16, 2025 · Updated 5 months ago