yoavgur / PISCES
PISCES - Precise In-Parameter Suppression for Concept EraSure in Large Language Models
☆12 Updated 8 months ago
Alternatives and similar repositories for PISCES
Users who are interested in PISCES are comparing it to the libraries listed below
- Measuring the Mixing of Contextual Information in the Transformer ☆34 Updated 2 years ago
- Codes for "Benchmarking the Generation of Fact Checking Explanations" ☆10 Updated last year
- Attribute statements generated by LLMs to preceding tokens using attention weights. ☆21 Updated 9 months ago
- ☆29 Updated last year
- Find informative examples to efficiently (human)-evaluate NLG models. ☆18 Updated 2 weeks ago
- DecompX: Explaining Transformers Decisions by Propagating Token Decomposition [ACL 2023] ☆19 Updated 7 months ago
- [NAACL 2022] GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers ☆21 Updated 2 years ago
- ☆39 Updated 4 years ago
- A framework for evaluating Machine Translation models. ☆12 Updated 8 months ago
- A Large-Scale Gender Bias Dataset for Coreference Resolution and Machine Translation, Levy et al., Findings of EMNLP 2021 ☆14 Updated 3 years ago
- Landing page for MIB: A Mechanistic Interpretability Benchmark ☆24 Updated 5 months ago
- A software for transferring pre-trained English models to foreign languages ☆19 Updated 2 years ago
- Code and data for the NAACL 2021 paper: "XFORMAL: A Benchmark for Multilingual Formality Style Transfer" ☆12 Updated 4 years ago
- The geometry of multilingual language model representations (EMNLP 2022). ☆22 Updated 3 years ago
- ☆90 Updated 3 years ago
- Repository for DEMETR: Diagnosing Evaluation Metrics for Translation ☆17 Updated 3 years ago
- Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/ ☆26 Updated 10 months ago
- Materials for "Quantifying the Plausibility of Context Reliance in Neural Machine Translation" at ICLR'24 ☆15 Updated last year
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆42 Updated 2 years ago
- ☆20 Updated 2 years ago
- Code for "Tracing Knowledge in Language Models Back to the Training Data" ☆39 Updated 3 years ago
- ☆18 Updated 3 years ago
- This repository contains the code for "Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP". ☆89 Updated 4 years ago
- [NeurIPS 2025 MechInterp Workshop - Spotlight] Official implementation of the paper "RelP: Faithful and Efficient Circuit Discovery in La… ☆24 Updated 3 months ago
- Benchmark API for Multidomain Language Modeling ☆25 Updated 3 years ago
- ☆17 Updated 5 months ago
- ☆15 Updated 4 years ago
- Code for preprint: Summarizing Differences between Text Distributions with Natural Language ☆43 Updated 2 years ago
- ☆16 Updated 2 years ago
- ☆24 Updated 4 years ago