A command-line program to download text corpora.
☆34Aug 12, 2017Updated 8 years ago
Alternatives and similar repositories for corpus-downloader
Users who are interested in corpus-downloader compare it to the libraries listed below
- Scripts for scraping metadata from Project Gutenberg books, via GITenberg.☆19Sep 11, 2018Updated 7 years ago
- ENGL 87400 - Text Transformations (Graduate Center, CUNY - Spring 2015)☆12Mar 30, 2015Updated 10 years ago
- A structured list of text corpora, created for use with a corpus downloader.☆13Aug 27, 2016Updated 9 years ago
- Use spaCy for NLP and output to the FoLiA XML format.☆12Feb 27, 2024Updated 2 years ago
- The Art of Literary Text Analysis☆169Apr 4, 2019Updated 6 years ago
- Plots various graphs for a series of plaintext files in a directory☆19Jun 6, 2016Updated 9 years ago
- Regex-like tree pattern matching, applied to a sentence's tree instead of strings☆42Mar 6, 2018Updated 8 years ago
- Bayesian nonparametric models for Python☆18Sep 11, 2018Updated 7 years ago
- spaCy-to-naf converter☆21Jun 10, 2025Updated 8 months ago
- Text-Induced Corpus Clean-up☆20Jun 20, 2023Updated 2 years ago
- Code for learning geographically-informed word embeddings☆22Feb 4, 2022Updated 4 years ago
- Topic Words in Context (TWiC) is a highly-interactive, browser-based visualization for MALLET topic models☆49Jul 13, 2017Updated 8 years ago
- Breve☆29Jul 30, 2019Updated 6 years ago
- Ubiflux Vigor ventilation system RS485 Modbus communications with Python☆11Feb 20, 2026Updated 2 weeks ago
- InfiniteUlysses.com repo as it was when I finished the related Ph.D. project. See instead github.com/amandavisconti/infinite-ulysses-publ…☆26Mar 15, 2022Updated 3 years ago
- A fast, simple, multilingual tokenizer☆29May 24, 2017Updated 8 years ago
- Text Corpus of African American Fiction and Poetry, from 1853-1923☆10Aug 5, 2020Updated 5 years ago
- Evaluate your word embeddings☆35Dec 3, 2019Updated 6 years ago
- spaCy match and replace, maintaining conjugation☆36Dec 9, 2022Updated 3 years ago
- ☆10Jun 16, 2017Updated 8 years ago
- Hungarian tokenizer.☆14Mar 15, 2022Updated 3 years ago
- A starter codebase for a Windrift game☆11Nov 20, 2021Updated 4 years ago
- An online comic maker built by the State Library of Queensland for the international Fun Palaces event. Concept by Matt Finch, based on "…☆10Jan 27, 2017Updated 9 years ago
- Simple CORPORA list crawler☆10Dec 2, 2016Updated 9 years ago
- A set of base classes for running training scripts for neural networks (by means of SNNS) and SVM (by means of SVM Light and SVM …☆14Jun 24, 2011Updated 14 years ago
- Deploy a Ceramic daemon to AWS☆13Apr 18, 2023Updated 2 years ago
- ☆10Jul 2, 2019Updated 6 years ago
- ☆11Nov 14, 2021Updated 4 years ago
- Backbone.Schema allows developers to specify rich Backbone models and collections with JSON-Schema.☆38Apr 14, 2016Updated 9 years ago
- CSV inspection☆10Dec 20, 2022Updated 3 years ago
- Notebook for looking at 35 years of historical US degrees data from NCES-IPEDS☆11Dec 18, 2018Updated 7 years ago
- Materials accompanying the upstrap paper☆12Oct 19, 2020Updated 5 years ago
- Resources from the Question Generation Shared Task & Evaluation Challenge 2010☆12Dec 21, 2010Updated 15 years ago
- Example Scala Spray project☆12Jun 27, 2020Updated 5 years ago
- MDLText☆12Jul 13, 2017Updated 8 years ago
- PDF table extraction☆10Dec 14, 2021Updated 4 years ago
- Home Assistant custom component for Pollen Information in Hungary☆15Jul 17, 2024Updated last year
- Natural Language Processing tools☆12Jan 26, 2017Updated 9 years ago
- A Python library for easily querying morphological inflection models trained on Unimorph☆13Oct 23, 2022Updated 3 years ago