inseq-team / inseq

Interpretability for sequence generation models 🐛 🔍

☆441

Alternatives and similar repositories for inseq

Users who are interested in inseq are comparing it to the libraries listed below.

AlignmentResearch / tuned-lens
Tools for understanding how transformer predictions are built layer-by-layer
☆532 Updated 2 months ago
stanfordnlp / pyvene
Stanford NLP Python library for understanding and improving PyTorch models via interventions
☆819 Updated last week
interpretingdl / eacl2024_transformer_interpretability_tutorial
Materials for EACL2024 tutorial: Transformer-specific Interpretability
☆60 Updated last year
allenai / wimbd
What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets
☆223 Updated 11 months ago
facebookresearch / ResponsibleNLP
Repository for research in the field of Responsible NLP at Meta.
☆202 Updated 5 months ago
IINemo / lm-polygraph
☆369 Updated this week
r-three / t-few
Code for T-Few from "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning"
☆457 Updated 2 years ago
evandez / relations
How do transformer LMs encode relations?
☆55 Updated last year
mega002 / lm-debugger
The official code of LM-Debugger, an interactive tool for inspection and intervention in transformer-based language models.
☆179 Updated 3 years ago
hendrycks / ethics
Aligning AI With Shared Human Values (ICLR 2021)
☆303 Updated 2 years ago
krishnap25 / mauve
Package to compute Mauve, a similarity score between neural text and human text. Install with `pip install mauve-text`.
☆298 Updated last year
ruizheliUOA / Awesome-Interpretability-in-Large-Language-Models
This repository collects all relevant resources about interpretability in LLMs
☆375 Updated 11 months ago
collin-burns / discovering_latent_knowledge
☆279 Updated last year
EleutherAI / concept-erasure
Erasing concepts from neural representations with provable guarantees
☆238 Updated 8 months ago
fdalvi / NeuroX
A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models.
☆106 Updated 2 years ago
sileod / tasksource
Dataset collection and preprocessing framework for NLP extreme multitask learning
☆188 Updated 3 months ago
g8a9 / ferret
A Python package for benchmarking interpretability techniques on Transformers.
☆213 Updated last year
MilaNLProc / simple-generation
A Python package to run inference with HuggingFace language and vision-language checkpoints, wrapping many convenient features.
☆28 Updated last year
PAIR-code / interpretability
PAIR.withgoogle.com and friends' work on interpretability methods
☆204 Updated last month
craffel / llm-seminar
Seminar on Large Language Models (COMP790-101 at UNC Chapel Hill, Fall 2022)
☆311 Updated 2 years ago
ARBORproject / arborproject.github.io
☆81 Updated 7 months ago
ArthurConmy / Automatic-Circuit-Discovery
☆247 Updated last year
TransformerLensOrg / CircuitsVis
Mechanistic Interpretability Visualizations using React
☆293 Updated 10 months ago
EleutherAI / knowledge-neurons
A library for finding knowledge neurons in pretrained transformer models.
☆159 Updated 3 years ago
neulab / knn-transformers
PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an…
☆280 Updated 3 years ago
stanford-crfm / mistral
Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging F…
☆574 Updated last year
nostalgebraist / transformer-utils
Utilities for the HuggingFace transformers library
☆72 Updated 2 years ago
bigscience-workshop / lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
☆104 Updated 2 years ago
davidbau / baukit
☆234 Updated last year
ndif-team / nnsight
The nnsight package enables interpreting and manipulating the internals of deep learned models.
☆683 Updated last week