bamman-group / gpt4-books
Code and data to support "Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4"
☆67 Updated last year
Related projects:
- ☆19 Updated last year
- Finding semantically meaningful and accurate prompts. ☆45 Updated 10 months ago
- ☆29 Updated last year
- Libraries, Archives and Museums (LAM) ☆81 Updated last year
- Documentation effort for the BookCorpus dataset ☆30 Updated 3 years ago
- ☆27 Updated last year
- ☆33 Updated 2 years ago
- ☆31 Updated last year
- Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs" ☆28 Updated 2 years ago
- ☆67 Updated 6 months ago
- ☆19 Updated 4 months ago
- datasets from the paper "Towards Understanding Sycophancy in Language Models" ☆59 Updated 10 months ago
- ☆81 Updated 3 months ago
- ☆16 Updated last year
- ☆49 Updated last year
- analysis of public NLP corpora ☆12 Updated last year
- ☆44 Updated 2 months ago
- Code for SaGe subword tokenizer (EACL 2023) ☆21 Updated this week
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets. ☆11 Updated 9 months ago
- An experiment replicating part of "Why Literary Time is Measured in Minutes" with GPT-4. ☆32 Updated last year
- ☆38 Updated 5 months ago
- Ranking of fine-tuned HF models as base models. ☆35 Updated last year
- Plug-and-play Search Interfaces with Pyserini and Hugging Face ☆32 Updated last year
- A library for squeakily cleaning and filtering language datasets. ☆45 Updated last year
- Submission to the inverse scaling prize ☆23 Updated last year
- ☆30 Updated 4 years ago
- Are foundation LMs multilingual knowledge bases? (EMNLP 2023) ☆18 Updated 9 months ago
- GPT-4 Passes the Bar ☆21 Updated 9 months ago
- Code for our EMNLP '22 paper "Fixing Model Bugs with Natural Language Patches" ☆19 Updated last year
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding. ☆25 Updated last month