daveshap / PlainTextWikipedia
Convert Wikipedia database dumps into plaintext files
☆318 Updated 3 years ago
Alternatives and similar repositories for PlainTextWikipedia:
Users who are interested in PlainTextWikipedia are comparing it to the repositories listed below:
- Nearly a thousand bash and python scripts I've written over the years. ☆121 Updated 2 months ago
- Espial is an engine for automated organization and discovery of personal knowledge ☆176 Updated 2 years ago
- The world's largest social media toxicity dataset. ☆177 Updated 2 years ago
- Download subreddit comments ☆94 Updated 3 years ago
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine ☆242 Updated last year
- GPT Takes the Bar Exam ☆141 Updated 2 years ago
- Example scripts for the pushshift dump files ☆345 Updated 2 weeks ago
- Python code for building a GPT-3 based technical blog post optimizer. ☆84 Updated 2 years ago
- TextReducer - A Tool for Summarization and Information Extraction ☆87 Updated 11 months ago
- OpenAI API webserver ☆187 Updated 3 years ago
- Fine tune GPT-2 with your favourite authors ☆72 Updated last year
- Conversational text Analysis using various NLP techniques ☆181 Updated last year
- Contains scripts and data to render map of reddit ☆106 Updated last year
- GPT-3 Explorer ☆207 Updated 4 years ago
- Bookmark with a snooze button. Bookmark, buffer and complete your reading list. ☆91 Updated 2 years ago
- A comprehensive Data and Text Mining workflow for submissions and comments from any given public subreddit. ☆489 Updated 5 years ago
- Inference code for LLaMA models ☆188 Updated 2 years ago
- Beat Writer's Block with AI ☆146 Updated 2 years ago
- Strip non-presentational content out of HTML pages ☆45 Updated 2 years ago
- A client for OpenAI's GPT-3 API for ad hoc testing of prompts without using the web interface. ☆90 Updated 4 years ago
- Report and source code detailing the AI Dungeon private adventure vulnerability ☆66 Updated 3 years ago
- Dump of generated texts from GPT-2 trained on Hacker News titles ☆117 Updated 5 years ago
- Labelling platform for text using weak supervision.