inseq-team / inseq
Interpretability for sequence generation models
☆413 · Updated 2 weeks ago
Alternatives and similar repositories for inseq
Users who are interested in inseq are comparing it to the libraries listed below.
- ☆223 · Updated 7 months ago
- Tools for understanding how transformer predictions are built layer-by-layer ☆490 · Updated 11 months ago
- This repository collects all relevant resources about interpretability in LLMs ☆343 · Updated 6 months ago
- Mechanistic Interpretability Visualizations using React ☆244 · Updated 4 months ago
- What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets ☆218 · Updated 5 months ago
- The official code of LM-Debugger, an interactive tool for inspection and intervention in transformer-based language models. ☆177 · Updated 2 years ago
- Using sparse coding to find distributed representations used by neural networks. ☆242 · Updated last year
- Stanford NLP Python library for understanding and improving PyTorch models via interventions ☆740 · Updated last week
- Sparsify transformers with SAEs and transcoders ☆524 · Updated this week
- Materials for EACL2024 tutorial: Transformer-specific Interpretability ☆52 · Updated last year
- A python package for benchmarking interpretability techniques on Transformers. ☆212 · Updated 7 months ago
- ☆206 · Updated last year
- Repository for research in the field of Responsible NLP at Meta. ☆199 · Updated 5 months ago
- Datasets collection and preprocessings framework for NLP extreme multitask learning ☆180 · Updated 4 months ago
- Erasing concepts from neural representations with provable guarantees ☆228 · Updated 3 months ago
- Utilities for the HuggingFace transformers library ☆67 · Updated 2 years ago
- PAIR.withgoogle.com and friend's work on interpretability methods ☆186 · Updated last week
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆199 · Updated 4 months ago
- PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an… ☆272 · Updated 2 years ago
- ☆114 · Updated 9 months ago
- Training Sparse Autoencoders on Language Models ☆761 · Updated this week
- Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models … ☆172 · Updated last week
- Multilingual Large Language Models Evaluation Benchmark ☆123 · Updated 8 months ago
- The nnsight package enables interpreting and manipulating the internals of deep learned models. ☆559 · Updated this week
- ☆93 · Updated last month
- Steering Llama 2 with Contrastive Activation Addition ☆148 · Updated 11 months ago
- ☆265 · Updated last year
- Aligning AI With Shared Human Values (ICLR 2021) ☆286 · Updated 2 years ago
- ☆71 · Updated 2 months ago
- Sparse Autoencoder for Mechanistic Interpretability ☆243 · Updated 9 months ago