Convert Wikipedia database dumps into plaintext files
☆328 · May 23, 2021 · Updated 4 years ago
Alternatives and similar repositories for PlainTextWikipedia
Users that are interested in PlainTextWikipedia are comparing it to the libraries listed below.
- Detecting gibberish as a type of sentiment analysis with GPT2 ☆25 · Nov 10, 2020 · Updated 5 years ago
- Arabic Grapheme-to-Phoneme (G2P) Conversion ☆13 · Mar 15, 2025 · Updated last year
- Tool for the Automatic Analysis of Syntactic Sophistication and Complexity ☆30 · Nov 4, 2023 · Updated 2 years ago
- ☆14 · Sep 21, 2022 · Updated 3 years ago
- Combining encoder-based language models ☆11 · Nov 11, 2021 · Updated 4 years ago
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks ☆12 · Nov 9, 2021 · Updated 4 years ago
- Tools for encoding Magic: The Gathering cards into a form suitable for AI text generation ☆19 · May 2, 2021 · Updated 4 years ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆169 · Nov 7, 2022 · Updated 3 years ago
- ☆30 · Jun 2, 2025 · Updated 10 months ago
- KitanaQA: Adversarial training and data augmentation for neural question-answering models ☆56 · Jul 23, 2023 · Updated 2 years ago
- Export your (or other people's) Goodreads data to SQLite ☆90 · Aug 27, 2020 · Updated 5 years ago
- A tutorial that shows the powerful capabilities of the computer algebra system SymPy for solving problems of high school math, calculus, … ☆14 · Oct 5, 2021 · Updated 4 years ago
- Rob Pike's simple regex matcher converted to Go ☆11 · Aug 14, 2022 · Updated 3 years ago
- Generate a SQLite database from Wikipedia & Wikidata dumps. ☆36 · Mar 27, 2024 · Updated 2 years ago
- ☆32 · Jun 16, 2021 · Updated 4 years ago
- ☆16 · Jul 23, 2023 · Updated 2 years ago
- Codebase for running (conditional) probing experiments ☆21 · Nov 13, 2022 · Updated 3 years ago
- ☆70 · Nov 30, 2022 · Updated 3 years ago
- A tool for extracting plain text from Wikipedia dumps ☆3,976 · May 23, 2024 · Updated last year
- A PDF classifier ensemble with REST API service ☆23 · Mar 5, 2021 · Updated 5 years ago
- Explainable Zero-Shot Topic Extraction ☆65 · Aug 19, 2024 · Updated last year
- Airbnb clone with Ruby on Rails ☆11 · Aug 20, 2017 · Updated 8 years ago
- A collection of resources related to mindfiles (digital representations of your mind) ☆11 · Nov 11, 2019 · Updated 6 years ago
- A fast and simple JavaScript library specifically targeted at collecting search and search-related browser events. ☆43 · Nov 20, 2025 · Updated 4 months ago
- ☆32 · Mar 14, 2017 · Updated 9 years ago
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal … ☆32 · Apr 29, 2021 · Updated 4 years ago
- 📊 Semantic search for headlines and story text ☆359 · Sep 23, 2023 · Updated 2 years ago
- Finds packages that require updates in a Python environment ☆22 · Updated this week
- An On-Premises, Streaming Speech Recognition System ☆682 · Nov 28, 2021 · Updated 4 years ago
- Python builder for ASGI applications on Zeit Now ☆14 · Jun 24, 2022 · Updated 3 years ago
- An experimental implementation of Burrows' Delta in Python 3 ☆21 · Oct 1, 2021 · Updated 4 years ago
- A basic Rust OCI container runtime ☆13 · Jan 8, 2023 · Updated 3 years ago
- Implementation of the ClausIE information extraction system for Python + spaCy ☆226 · Aug 8, 2022 · Updated 3 years ago
- A robust Python tool for text-based AI training and generation using GPT-2. ☆1,839 · Jul 14, 2023 · Updated 2 years ago
- Codebase for the Text-based NP Enrichment (TNE) paper ☆19 · Mar 12, 2024 · Updated 2 years ago
- Python-based Wikidata framework for easy dataframe extraction ☆45 · Feb 21, 2026 · Updated last month
- This is a Reddit bot based on OpenAI's GPT-2 117M model ☆100 · Aug 27, 2019 · Updated 6 years ago
- Computational exploration of magical and divinatory language ☆24 · Jan 14, 2020 · Updated 6 years ago
- 📝 Agora is an off-chain governance tool for decentralized communities. ☆16 · Mar 14, 2025 · Updated last year