taivop / joke-dataset
A dataset of 200k English plaintext jokes.
☆604 · Updated 2 years ago
Alternatives and similar repositories for joke-dataset:
Users who are interested in joke-dataset are comparing it to the repositories listed below.
- Python scripts for building 'Short Jokes' dataset, featured on Kaggle ☆273 · Updated 4 years ago
- A dataset containing story plots from Wikipedia (books, movies, etc.) and the code for the extractor. ☆313 · Updated 7 years ago
- Collection of tools for building diachronic/historical word vectors ☆422 · Updated last year
- Interactive Lecture Notes, Slides and Exercises for Statistical NLP ☆270 · Updated 5 years ago
- Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive Language", ICWSM 2017 ☆791 · Updated last year
- A simple interface to the Project Gutenberg corpus. ☆323 · Updated 2 years ago
- Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts ☆3,400 · Updated 2 years ago
- I wanted all of plaintext Project Gutenberg in an easy-to-use format, so I made this ☆216 · Updated last year
- Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.co… ☆310 · Updated 2 years ago
- A corpus of 100,000 happy moments ☆363 · Updated 7 years ago
- Code for Defending Against Neural Fake News, https://rowanzellers.com/grover/ ☆918 · Updated last year
- Test prompts for OpenAI's GPT-3 API and the resulting AI-generated texts. ☆702 · Updated 4 years ago
- A large corpus of discourse annotations and relations on ~10K forum threads. ☆238 · Updated 6 years ago
- Community Curated NLP List ☆198 · Updated 2 years ago
- ☆145 · Updated last year
- Code for the paper "Language Models are Unsupervised Multitask Learners" ☆1,147 · Updated 2 years ago
- Uses NLP and wikipedia to try to generate trivia questions ☆130 · Updated 7 years ago
- Giant Language Model Test Room ☆466 · Updated last year
- The repo containing the Critical Role Dungeons and Dragons Dataset. ☆129 · Updated 5 months ago
- Ten thousand books, six million ratings ☆837 · Updated last year
- Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenizati… ☆664 · Updated 11 months ago
- ☆1,296 · Updated 2 years ago
- Automatically generate headlines to short articles ☆524 · Updated 6 years ago
- Dataset of GPT-2 outputs for research in detection, biases, and more ☆1,956 · Updated last year
- An open clone of the GPT-2 WebText dataset by OpenAI. Still WIP. ☆387 · Updated 10 months ago
- Default English stopword lists from many different sources ☆294 · Updated last year
- ☆324 · Updated 3 weeks ago
- A repository to house model building experiments and tools that are part of the Conversation AI effort. ☆139 · Updated this week
- analyze text with empath ☆320 · Updated 7 years ago
- word2vec Google News model slimmed down to 300k English words ☆216 · Updated 7 years ago