vivek3141 / ghostbuster-data
Data from the paper "Ghostbuster: Detecting Text Ghostwritten by Large Language Models"
☆14Updated last year
Alternatives and similar repositories for ghostbuster-data
Users who are interested in ghostbuster-data are comparing it to the repositories listed below
- Ghostbuster: Detecting Text Ghostwritten by Large Language Models (NAACL 2024)☆177Updated last year
- Official implementation of "Data Mixture Inference: What do BPE tokenizers reveal about their training data?"☆18Updated 8 months ago
- [NAACL 2024] Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K arXiv Papers https://arxiv.org/abs/2307.…☆17Updated 2 years ago
- ☆103Updated last year
- 🌾 Universal, customizable and deployable fine-grained evaluation for text generation.☆24Updated 2 years ago
- ☆38Updated 6 months ago
- Code repository for the paper "Mission: Impossible Language Models."☆56Updated 4 months ago
- ☆64Updated last month
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"☆28Updated 4 years ago
- Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/☆26Updated 11 months ago
- ☆37Updated 2 years ago
- Data and code for the paper "The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems"☆21Updated 2 years ago
- Enhanced version of WikiExtractor: a Wikipedia dumps extractor☆28Updated 4 months ago
- ☆17Updated 3 years ago
- Code for SaGe subword tokenizer (EACL 2023)☆27Updated last year
- A repository with several curated datasets of counter-narratives to fight online hate speech.☆94Updated 6 months ago
- Data for evaluating gender bias in coreference resolution systems.☆81Updated 6 years ago
- ☆19Updated 4 months ago
- Rust library for indexing and quickly searching large pretraining corpora☆30Updated 3 months ago
- ☆13Updated last year
- Materials for "Quantifying the Plausibility of Context Reliance in Neural Machine Translation" at ICLR'24 🐑 🐑☆15Updated last year
- The geometry of multilingual language model representations (EMNLP 2022).☆22Updated 3 years ago
- ☆37Updated 2 months ago
- Attribute statements generated by LLMs to preceding tokens using attention weights.☆21Updated 9 months ago
- A modular and extensible Python framework, designed to aid in the creation of high-quality, unbiased datasets to build robust models for …☆19Updated 3 months ago
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https…☆44Updated last year
- 🪝PISCES - Precise In-Parameter Suppression for Concept EraSure in Large Language Models☆12Updated 8 months ago
- ☆41Updated last year
- Code and data for the NAACL 2021 paper: "XFORMAL: A Benchmark for Multilingual Formality Style Transfer"☆12Updated 4 years ago
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers.☆54Updated 2 years ago