daveshap / PlainTextWikipedia
Convert Wikipedia database dumps into plaintext files
☆324 Updated 4 years ago
Alternatives and similar repositories for PlainTextWikipedia
Users who are interested in PlainTextWikipedia are comparing it to the libraries listed below.
- Nearly a thousand bash and python scripts I've written over the years. ☆124 Updated 7 months ago
- Download subreddit comments ☆96 Updated 3 years ago
- Conversational text Analysis using various NLP techniques ☆181 Updated 2 years ago
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine ☆242 Updated 2 years ago
- 🧠 AI memory assistant – remember everything you read ☆302 Updated 2 years ago
- GPT Takes the Bar Exam ☆142 Updated 2 years ago
- A Flask webapp & Python scripts for predicting reddit users' political leaning, using their comment history. ☆64 Updated 2 years ago
- A tool to automatically turn any Wikipedia article into a video ☆57 Updated 3 years ago
- 📊 Semantic search for headlines and story text ☆360 Updated last year
- Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more ☆221 Updated last year
- An on-going dataset consisting of hashtags, n-gram counts and other misc NLP things for covid-19 analysis, stemming from over 100 000 000… ☆58 Updated 3 years ago
- Cleaning tool for web scraped text ☆38 Updated 2 years ago
- Reddit takeout: export your account data as JSON: comments, submissions, upvotes etc. 🦖 ☆174 Updated 2 months ago
- Self-hosted GPT playground ☆115 Updated last year
- GPT2Explorer brings a playground for OpenAI's GPT-2 language models to run locally on standard Windows computers. ☆28 Updated 3 years ago
- Python code for building a GPT-3 based technical blog post optimizer. ☆85 Updated 3 years ago
- Official Trump Twitter Archive V2 source ☆142 Updated last year
- A Python script for downloading new MP3 files from given RSS channels ☆137 Updated 6 months ago
- Espial is an engine for automated organization and discovery of personal knowledge ☆174 Updated 3 years ago
- Contains scripts and data to render map of reddit ☆120 Updated 4 months ago
- Concise answers to search queries using Google and GPT-3. Includes citations. ☆80 Updated 2 years ago
- ☆44 Updated 4 years ago
- Unreliable News Index (for Columbia Journalism Review) ☆56 Updated 3 years ago
- Tools to construct and process Common Crawl webgraphs ☆96 Updated 3 weeks ago
- Dolores is a Python library designed to improve the developer experience when working with pretrained language models. Dolores provides p… ☆34 Updated 5 years ago
- This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions. ☆124 Updated last year
- Chat interface to gpt-j. Runs in Google Colab. ☆58 Updated 2 years ago
- Sick of that "Save as PDF" link on Wikipedia? Why not just have Python do it for you? ☆27 Updated 5 years ago
- TextReducer - A Tool for Summarization and Information Extraction ☆88 Updated last year
- Library Genesis (libgen) CLI/TUI/GUI client (mirror from private repo) ☆229 Updated 4 years ago