yoavgur / PISCES
PISCES - Precise In-Parameter Suppression for Concept EraSure in Large Language Models
☆12 · Updated 7 months ago
Alternatives and similar repositories for PISCES
Users who are interested in PISCES are comparing it to the libraries listed below.
- ☆16 · Updated 4 months ago
- Find informative examples to efficiently (human)-evaluate NLG models. ☆17 · Updated last month
- Measuring the Mixing of Contextual Information in the Transformer ☆34 · Updated 2 years ago
- A software for transferring pre-trained English models to foreign languages ☆19 · Updated 2 years ago
- Attribute statements generated by LLMs to preceding tokens using attention weights. ☆21 · Updated 8 months ago
- Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/ ☆26 · Updated 9 months ago
- The geometry of multilingual language model representations (EMNLP 2022). ☆22 · Updated 3 years ago
- ☆39 · Updated 4 years ago
- Codes for "Benchmarking the Generation of Fact Checking Explanations" ☆10 · Updated last year
- ☆90 · Updated 3 years ago
- Landing page for MIB: A Mechanistic Interpretability Benchmark ☆23 · Updated 4 months ago
- Materials for "Quantifying the Plausibility of Context Reliance in Neural Machine Translation" at ICLR'24 ☆15 · Updated last year
- DecompX: Explaining Transformers Decisions by Propagating Token Decomposition [ACL 2023] ☆19 · Updated 6 months ago
- ☆29 · Updated last year
- This repository contains the code for "Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP". ☆88 · Updated 4 years ago
- ☆113 · Updated 3 years ago
- ☆24 · Updated 4 years ago
- ☆16 · Updated 2 years ago
- Debiasing Methods in Natural Language Understanding Make Bias More Accessible: Code and Data ☆14 · Updated 3 years ago
- Evaluate interpretability methods on localizing and disentangling concepts in LLMs. ☆57 · Updated 2 months ago
- ☆18 · Updated 3 years ago
- Data for evaluating gender bias in coreference resolution systems. ☆81 · Updated 6 years ago
- ☆47 · Updated last year
- [NAACL 2022] GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers ☆21 · Updated 2 years ago
- A Large-Scale Gender Bias Dataset for Coreference Resolution and Machine Translation, Levy et al., Findings of EMNLP 2021 ☆14 · Updated 3 years ago
- To analyze and remove gender bias in coreference resolution systems ☆78 · Updated 8 months ago
- Dataset associated with "BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation" paper ☆85 · Updated 4 years ago
- Code and data for the NAACL 2021 paper: "XFORMAL: A Benchmark for Multilingual Formality Style Transfer" ☆12 · Updated 4 years ago
- Code for preprint: Summarizing Differences between Text Distributions with Natural Language ☆43 · Updated 2 years ago
- Code of NAACL 2022 "Efficient Hierarchical Domain Adaptation for Pretrained Language Models" paper. ☆32 · Updated 2 years ago