daveshap / PlainTextWikipedia
Convert Wikipedia database dumps into plaintext files
☆319 · Updated 3 years ago
Alternatives and similar repositories for PlainTextWikipedia:
Users interested in PlainTextWikipedia are comparing it to the repositories listed below.
- Semantic search through a vectorized Wikipedia (SentenceBERT) with the Weaviate vector search engine ☆242 · Updated last year
- A GPT-J API to use with python3 to generate text, blogs, code, and more ☆205 · Updated 2 years ago
- Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more ☆216 · Updated last year
- TextReducer - A Tool for Summarization and Information Extraction ☆87 · Updated 11 months ago
- Multi-angle c(q)uestion answering ☆458 · Updated 2 years ago
- The AI Knowledge Editor ☆182 · Updated 2 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 · Updated 5 years ago
- 📊 Semantic search for headlines and story text ☆360 · Updated last year
- Example scripts for the pushshift dump files ☆357 · Updated 2 weeks ago
- A Flask webapp & Python scripts for predicting reddit users' political leaning, using their comment history. ☆65 · Updated last year
- ☆30 · Updated 3 years ago
- ☆81 · Updated 6 years ago
- An ongoing dataset consisting of hashtags, n-gram counts and other misc NLP things for covid-19 analysis, stemming from over 100 000 000… ☆57 · Updated 3 years ago
- Conversational text Analysis using various NLP techniques ☆181 · Updated last year
- SFGram (Science-Fiction Gram) is a dataset of public science-fiction novels, books and movie covers. It is designed to be used by researc… ☆31 · Updated 6 years ago
- NewsMap JS - JS implementation of the defunct newsmap.jp ☆108 · Updated 4 months ago
- A tool to automatically turn any Wikipedia article into a video ☆56 · Updated 2 years ago
- ☆104 · Updated last year
- Experiment to generate novel-length fiction from a single story premise ☆29 · Updated 2 years ago
- This Python module can be used to obtain antonyms, synonyms, hypernyms, hyponyms, homophones and definitions. ☆123 · Updated 10 months ago
- Releases for the reddit-graph project ☆19 · Updated 9 months ago
- Command line tool to convert a file in the WARC format to a file in the ZIM format ☆56 · Updated last month
- Self-hosted GPT playground ☆113 · Updated 8 months ago
- Public MVP of Raven. It's been long enough, time to do a full send. ☆34 · Updated 2 years ago
- A set of utility scripts to process Wikipedia related data ☆38 · Updated 2 years ago
- Espial is an engine for automated organization and discovery of personal knowledge ☆175 · Updated 2 years ago
- Public repo for Raven MVP ☆17 · Updated 3 years ago
- The Core Objective Functions are the solution to the Control Problem. They will result in a benevolent and trustworthy AGI. ☆25 · Updated 3 years ago
- 💭 Retrieval augmented generation (RAG) and language model powered search applications ☆288 · Updated this week
- Quote extraction for modular journalism (JournalismAI collab 2021) ☆227 · Updated 3 years ago