daveshap / PlainTextWikipedia
Convert Wikipedia database dumps into plaintext files
☆324 Updated 4 years ago
Alternatives and similar repositories for PlainTextWikipedia
Users who are interested in PlainTextWikipedia are comparing it to the repositories listed below.
- Nearly a thousand Bash and Python scripts I've written over the years. ☆123 Updated 8 months ago
- Download subreddit comments ☆96 Updated 3 years ago
- Conversational text analysis using various NLP techniques ☆182 Updated 2 years ago
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine ☆243 Updated 2 years ago
- Pipeline to generate the Standardized Project Gutenberg Corpus ☆203 Updated last year
- The world's largest social media toxicity dataset. ☆187 Updated 3 years ago
- GPT Takes the Bar Exam ☆142 Updated 2 years ago
- Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more ☆221 Updated last year
- This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions. ☆124 Updated last year
- 🧠 AI memory assistant – remember everything you read ☆302 Updated 2 years ago
- 📊 Semantic search for headlines and story text ☆361 Updated 2 years ago
- A tool to automatically turn any Wikipedia article into a video ☆57 Updated 3 years ago
- ☆61 Updated 2 years ago
- An on-going dataset consisting of hashtags, n-gram counts and other misc NLP things for covid-19 analysis, stemming from over 100 000 000… ☆58 Updated 3 years ago
- Tools to construct and process Common Crawl webgraphs ☆98 Updated last week
- GPT-3 Explorer ☆208 Updated 5 years ago
- ☆81 Updated 6 years ago
- The AI Knowledge Editor ☆185 Updated 3 years ago
- Python code for building a GPT-3 based technical blog post optimizer. ☆84 Updated 3 years ago
- Unreliable News Index (for Columbia Journalism Review) ☆56 Updated 3 years ago
- A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine ☆184 Updated last week
- Python script to download public Tweets from a given Twitter account into a format suitable for AI text generation. ☆226 Updated 5 years ago
- 🤗Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0. ☆55 Updated 3 years ago
- A client for OpenAI's GPT-3 API for ad hoc testing of prompts without using the web interface. ☆89 Updated 5 years ago
- This AI Does Not Exist: generate realistic descriptions of made-up machine learning models. ☆147 Updated 3 years ago
- A Python package that helps scrape all news details from any news website ☆219 Updated 4 months ago
- Neural Search ☆333 Updated last year
- Chat interface to gpt-j. Runs in Google Colab. ☆58 Updated 2 years ago
- Statistics of Common Crawl monthly archives mined from URL index files ☆193 Updated last week
- TextReducer - A Tool for Summarization and Information Extraction ☆87 Updated last year