Convert Wikipedia database dumps into plaintext files
☆331 · May 23, 2021 · Updated 4 years ago
Alternatives and similar repositories for PlainTextWikipedia
Users that are interested in PlainTextWikipedia are comparing it to the libraries listed below.
- ☆12 · Mar 23, 2022 · Updated 4 years ago
- ☆16 · May 20, 2022 · Updated 3 years ago
- Open source copy of my book Natural Language Cognitive Architecture ☆163 · Mar 9, 2022 · Updated 4 years ago
- Public repo for my book Symphony of Thought: Orchestrating Artificial Cognition ☆112 · Sep 7, 2022 · Updated 3 years ago
- GPT-3 chatbot with long-term memory and external sources ☆621 · Jan 31, 2023 · Updated 3 years ago
- Arabic Grapheme-to-Phoneme (G2P) Conversion ☆13 · Mar 15, 2025 · Updated last year
- ☆21 · Updated this week
- A python module to process data for Frame Semantic Parsing ☆23 · Nov 3, 2020 · Updated 5 years ago
- Embracing Novelty, Growth, and Genuine Experiences (ENGAGE) ☆12 · Mar 31, 2023 · Updated 3 years ago
- Tool for cleaning old and redundant backups ☆14 · Dec 26, 2025 · Updated 4 months ago
- Experiment to answer questions from arbitrary number of sources ☆81 · Jun 25, 2022 · Updated 3 years ago
- Public experiment with prompt-chaining to generate critical arguments ☆26 · Jun 6, 2022 · Updated 3 years ago
- Combining encoder-based language models ☆11 · Nov 11, 2021 · Updated 4 years ago
- Let's see what we can do with SCOTUS opinions ☆26 · Dec 19, 2022 · Updated 3 years ago
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks ☆12 · Nov 9, 2021 · Updated 4 years ago
- Tools for encoding Magic: The Gathering cards into a form suitable for AI text generation ☆19 · May 2, 2021 · Updated 4 years ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆170 · Nov 7, 2022 · Updated 3 years ago
- SQLite grammar for tree-sitter ☆23 · Jun 24, 2023 · Updated 2 years ago
- Multi user Jupyterhub with C++, Java, Python, Tensorflow, Julia, SQL, NodeJS, Bash and more! ☆19 · Oct 6, 2021 · Updated 4 years ago
- KitanaQA: Adversarial training and data augmentation for neural question-answering models ☆56 · Jul 23, 2023 · Updated 2 years ago
- 📈 All data from my life (location, health, work, play, and more), open sourced ☆14 · Jul 5, 2022 · Updated 3 years ago
- A tutorial that shows the powerful capabilities of the computer algebra system SymPy for solving problems of high school math, calculus, … ☆14 · Oct 5, 2021 · Updated 4 years ago
- Export your (or other people's) Goodreads data to SQLite ☆90 · Aug 27, 2020 · Updated 5 years ago
- Command-line program for organizing and managing ebook collections. It is a Python port from the original shell scripts ebook-tools ☆22 · May 5, 2024 · Updated last year
- Generate a SQLite database from Wikipedia & Wikidata dumps. ☆37 · Mar 27, 2024 · Updated 2 years ago
- Codebase for running (conditional) probing experiments ☆21 · Nov 13, 2022 · Updated 3 years ago
- ☆16 · Jul 23, 2023 · Updated 2 years ago
- ☆70 · Nov 30, 2022 · Updated 3 years ago
- Script to import youtube-dl metadata to PostgreSQL ☆14 · Aug 13, 2018 · Updated 7 years ago
- A tool for extracting plain text from Wikipedia dumps ☆3,982 · May 23, 2024 · Updated last year
- A PDF classifier ensemble with REST API service ☆23 · Mar 5, 2021 · Updated 5 years ago
- CraftML is a restful web service for easy pipeline creation without code. ☆13 · Apr 18, 2021 · Updated 5 years ago
- Makes URLs prettier by removing the protocol prefix. ☆11 · Jan 26, 2025 · Updated last year
- Code for pre-training CharacterBERT models (as well as BERT models). ☆34 · Sep 6, 2021 · Updated 4 years ago
- A fast and simple JavaScript library specifically targeted at collecting search and search related browser events. ☆43 · Nov 20, 2025 · Updated 5 months ago
- ☆32 · Mar 14, 2017 · Updated 9 years ago
- ☆76 · Oct 25, 2021 · Updated 4 years ago
- 📊 Semantic search for headlines and story text ☆359 · Sep 23, 2023 · Updated 2 years ago
- An On-Premises, Streaming Speech Recognition System ☆681 · Nov 28, 2021 · Updated 4 years ago