Convert Wikipedia database dumps into plaintext files
☆326 · May 23, 2021 · Updated 4 years ago
Alternatives and similar repositories for PlainTextWikipedia
Users who are interested in PlainTextWikipedia are comparing it to the repositories listed below.
- ☆13 · Jul 1, 2023 · Updated 2 years ago
- ☆21 · Updated this week
- A python module to process data for Frame Semantic Parsing ☆23 · Nov 3, 2020 · Updated 5 years ago
- ☆15 · Mar 11, 2024 · Updated 2 years ago
- ☆14 · Sep 21, 2022 · Updated 3 years ago
- The tools used with my "A dive into the world of MS-DOS viruses" talk ☆40 · Jan 4, 2019 · Updated 7 years ago
- Github Actions wrapper for vmtest ☆12 · Jan 10, 2025 · Updated last year
- Combining encoder-based language models ☆11 · Nov 11, 2021 · Updated 4 years ago
- Tools for encoding Magic: The Gathering cards into a form suitable for AI text generation ☆19 · May 2, 2021 · Updated 4 years ago
- Desktop containers for RancherOS ☆15 · Sep 7, 2017 · Updated 8 years ago
- 📈 All data from my life — location, health, work, play, and more — open sourced ☆14 · Jul 5, 2022 · Updated 3 years ago
- A custom element that aims to make it easier to embed Spring '83 boards ☆16 · Jul 15, 2022 · Updated 3 years ago
- ☆32 · Jun 16, 2021 · Updated 4 years ago
- Codebase for running (conditional) probing experiments ☆22 · Nov 13, 2022 · Updated 3 years ago
- ☆70 · Nov 30, 2022 · Updated 3 years ago
- Script to import youtube-dl metadata to PostgreSQL ☆14 · Aug 13, 2018 · Updated 7 years ago
- A tool for extracting plain text from Wikipedia dumps ☆3,971 · May 23, 2024 · Updated last year
- CraftML is a restful web service for easy pipeline creation without code. ☆13 · Apr 18, 2021 · Updated 4 years ago
- Search and download accepted papers from machine learning conferences ☆34 · Apr 10, 2023 · Updated 2 years ago
- Code for pre-training CharacterBERT models (as well as BERT models). ☆34 · Sep 6, 2021 · Updated 4 years ago
- Convert the IBM MDA font to PNG and vice-versa ☆28 · Jan 8, 2021 · Updated 5 years ago
- Indexing of the death record files published by INSEE ☆21 · Sep 3, 2023 · Updated 2 years ago
- Download suggested queries based on your input generated by Google Suggest API ☆19 · May 9, 2021 · Updated 4 years ago
- 📊 Semantic search for headlines and story text ☆359 · Sep 23, 2023 · Updated 2 years ago
- Finds packages that require updates on a python environment ☆22 · Updated this week
- An On-Premises, Streaming Speech Recognition System ☆682 · Nov 28, 2021 · Updated 4 years ago
- ☆11 · Mar 26, 2017 · Updated 8 years ago
- A robust Python tool for text-based AI training and generation using GPT-2. ☆1,839 · Jul 14, 2023 · Updated 2 years ago
- codebase for the Text-based NP Enrichment (TNE) paper ☆19 · Mar 12, 2024 · Updated 2 years ago
- eXtended Memory Manager (XMM) that can manage memory beyond the 4 GB barrier, up to 1 terabyte. ☆48 · Mar 3, 2026 · Updated 2 weeks ago
- Python based Wikidata framework for easy dataframe extraction ☆45 · Feb 21, 2026 · Updated last month
- Thoughts toward and tutorial on corpus-driven narrative generation ☆25 · Nov 5, 2020 · Updated 5 years ago
- Computational exploration of magical and divinatory language ☆24 · Jan 14, 2020 · Updated 6 years ago
- Header-only C++/python library for fast approximate nearest neighbors ☆18 · Feb 9, 2020 · Updated 6 years ago
- Creation, management, and exchange of autoblogs (version 0.3) ☆46 · Feb 10, 2025 · Updated last year
- Code release for Type-Aware Bi-Encoders for Open-Domain Entity Retrieval ☆19 · Sep 24, 2022 · Updated 3 years ago
- Add website scraping abilities to Datasette ☆66 · Mar 4, 2023 · Updated 3 years ago
- A 🤗-style implementation of BERT using lambda layers instead of self-attention ☆69 · Oct 19, 2020 · Updated 5 years ago
- Wait for a commit's check suites to complete. ☆14 · Aug 30, 2023 · Updated 2 years ago