inseq-team / inseq
Interpretability for sequence generation models 🐛 🔍
☆454 Updated 3 weeks ago
Alternatives and similar repositories for inseq
Users who are interested in inseq are comparing it to the libraries listed below.
- Tools for understanding how transformer predictions are built layer-by-layer ☆565 Updated 5 months ago
- Materials for EACL2024 tutorial: Transformer-specific Interpretability ☆63 Updated last year
- ☆429 Updated this week
- Stanford NLP Python library for understanding and improving PyTorch models via interventions ☆854 Updated this week
- What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets ☆226 Updated last year
- Repository for research in the field of Responsible NLP at Meta. ☆205 Updated last week
- The official code of LM-Debugger, an interactive tool for inspection and intervention in transformer-based language models. ☆181 Updated 3 years ago
- Code for T-Few from "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning" ☆457 Updated 2 years ago
- How do transformer LMs encode relations? ☆55 Updated last year
- ACL 2022: An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models. ☆153 Updated 5 months ago
- Package to compute Mauve, a similarity score between neural text and human text. Install with `pip install mauve-text`. ☆307 Updated last year
- A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models. ☆106 Updated 2 years ago
- Erasing concepts from neural representations with provable guarantees ☆242 Updated last year
- This repository collects all relevant resources about interpretability in LLMs ☆390 Updated last year
- Aligning AI With Shared Human Values (ICLR 2021) ☆314 Updated 2 years ago
- ☆284 Updated last year
- PAIR.withgoogle.com and friends' work on interpretability methods ☆219 Updated last week
- ☆244 Updated last year
- StereoSet: Measuring stereotypical bias in pretrained language models ☆196 Updated 3 years ago
- A framework for few-shot evaluation of autoregressive language models. ☆106 Updated 2 years ago
- Mechanistic Interpretability Visualizations using React ☆318 Updated last year
- Utilities for the HuggingFace transformers library ☆74 Updated 3 years ago
- ☆265 Updated last year
- Locating and editing factual associations in GPT (NeurIPS 2022) ☆719 Updated last year
- Seminar on Large Language Models (COMP790-101 at UNC Chapel Hill, Fall 2022) ☆313 Updated 3 years ago
- A Python package for benchmarking interpretability techniques on Transformers. ☆215 Updated last year
- ☆65 Updated 2 years ago
- ☆83 Updated 11 months ago
- A library for finding knowledge neurons in pretrained transformer models. ☆159 Updated 3 years ago
- Sparse probing paper full code. ☆66 Updated 2 years ago