vivek3141 / ghostbuster-data
Data from the paper "Ghostbuster: Detecting Text Ghostwritten by Large Language Models"
☆15 Updated last year
Alternatives and similar repositories for ghostbuster-data
Users who are interested in ghostbuster-data are comparing it to the repositories listed below.
- Official implementation of "Data Mixture Inference: What do BPE tokenizers reveal about their training data?" ☆14 Updated 2 months ago
- ☆36 Updated 9 months ago
- Code for SaGe subword tokenizer (EACL 2023) ☆25 Updated 7 months ago
- ☆16 Updated 7 years ago
- ☆17 Updated 2 years ago
- Rust library for indexing and quickly searching large pretraining corpora ☆27 Updated 2 weeks ago
- 🌾 Universal, customizable and deployable fine-grained evaluation for text generation. ☆23 Updated last year
- Minimalist BERT implementation assignment for CS11-711 ☆83 Updated 2 years ago
- ☆19 Updated last month
- Collection of academic works in natural language processing, computational linguistics, and computational cognitive science that study th… ☆20 Updated last year
- ☆48 Updated last month
- Semantically Structured Sentence Embeddings ☆66 Updated 9 months ago
- ☆22 Updated 3 years ago
- Code and data for "Superbizarre Is Not Superb: Derivational Morphology Improves BERT's Interpretation of Complex Words" ☆16 Updated 3 years ago
- Utility for behavioral and representational analyses of Language Models ☆153 Updated 2 weeks ago
- My NER Experiments with ModernBERT and Ettin ☆21 Updated last week
- Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers. ☆54 Updated last year
- ☆98 Updated last year
- PathPiece tokenizer ☆12 Updated 8 months ago
- A Test Collection of Computer Science Papers for Faceted Query by Example ☆21 Updated 3 years ago
- ☆53 Updated last year
- Multidocument Summarization for Literature Review Shared Task 2022 ☆30 Updated 2 years ago
- ACL Rolling Review website ☆11 Updated this week
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper) ☆71 Updated 2 years ago
- A lightweight Python library for constructing, processing, and visualizing constituent trees. ☆66 Updated 6 months ago
- ☆53 Updated 3 years ago
- A Python library that encapsulates various methods for neuron interpretation and analysis in Deep NLP models. ☆102 Updated last year
- Public repository for SemEval 2023 - Task 10 - Explainable Detection of Online Sexism (EDOS) ☆23 Updated 2 years ago
- A survey of corpora for Germanic low-resource languages and dialects ☆25 Updated 7 months ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face ☆32 Updated last year