chentong0 / copy-bench
CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation
☆14 · Updated 8 months ago
Alternatives and similar repositories for copy-bench:
Users interested in copy-bench are comparing it to the repositories listed below.
- LoFiT: Localized Fine-tuning on LLM Representations ☆34 · Updated 2 months ago
- [ACL 2024] Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models ☆47 · Updated 6 months ago
- ☆17 · Updated 3 weeks ago
- Code for Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities (NeurIPS'24) ☆17 · Updated 3 months ago
- 🤫 Code and benchmark for our ICLR 2024 spotlight paper: "Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Con…" ☆41 · Updated last year
- [NeurIPS 2024 D&B] Evaluating Copyright Takedown Methods for Language Models ☆17 · Updated 8 months ago
- Röttger et al. (NAACL 2024): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models" ☆95 · Updated last month
- A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity. ☆71 · Updated 3 weeks ago
- [ICLR'25 Spotlight] Min-K%++: Improved baseline for detecting pre-training data of LLMs ☆37 · Updated last month
- Implementation of PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆35 · Updated 4 months ago
- [ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models ☆80 · Updated 6 months ago
- This repository contains data, code and models for contextual noncompliance. ☆20 · Updated 8 months ago
- ☆42 · Updated last month
- [NAACL'25 Oral] Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering ☆52 · Updated 4 months ago
- ✨ Resolving Knowledge Conflicts in Large Language Models, COLM 2024 ☆15 · Updated 6 months ago
- [ACL 2024] Code and data for "Machine Unlearning of Pre-trained Large Language Models" ☆56 · Updated 6 months ago
- ☆25 · Updated 6 months ago
- ☆29 · Updated 11 months ago
- DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models (ICLR 2024) ☆63 · Updated 5 months ago
- ☆47 · Updated last year
- AbstainQA, ACL 2024 ☆25 · Updated 5 months ago
- ☆38 · Updated last year
- ☆49 · Updated 7 months ago
- Augmenting Statistical Models with Natural Language Parameters ☆23 · Updated 6 months ago
- ☆34 · Updated 6 months ago
- ☆21 · Updated 2 weeks ago
- Official Repository for The Paper: Safety Alignment Should Be Made More Than Just a Few Tokens Deep ☆83 · Updated 8 months ago
- ☆27 · Updated last month
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆89 · Updated 10 months ago
- Source code and data for ADEPT: A DEbiasing PrompT Framework (AAAI-23) ☆14 · Updated 3 months ago