A command-line program to download text corpora.
☆34 · Aug 12, 2017 · Updated 8 years ago
Alternatives and similar repositories for corpus-downloader
Users who are interested in corpus-downloader are comparing it to the libraries listed below.
- Use spaCy for NLP and output to the FoLiA XML format. ☆12 · Feb 27, 2024 · Updated 2 years ago
- The Art of Literary Text Analysis ☆169 · Apr 4, 2019 · Updated 6 years ago
- DBpedia Neural Question Answering Dataset ☆18 · Jun 28, 2020 · Updated 5 years ago
- Python implementation of the Zeta score for contrastive text analysis ☆14 · Jun 16, 2021 · Updated 4 years ago
- Regex like pattern tree matching but on sentence's tree instead of Strings ☆42 · Mar 6, 2018 · Updated 8 years ago
- Bayesian nonparametric models for python ☆18 · Sep 11, 2018 · Updated 7 years ago
- spaCy-to-naf converter ☆21 · Jun 10, 2025 · Updated 8 months ago
- Code for learning geographically-informed word embeddings ☆22 · Feb 4, 2022 · Updated 4 years ago
- Ubiflux Vigor ventilation system RS485 Modbus communications with Python ☆11 · Feb 20, 2026 · Updated 2 weeks ago
- ☆25 · Apr 28, 2020 · Updated 5 years ago
- Breve ☆29 · Jul 30, 2019 · Updated 6 years ago
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts. ☆40 · Jun 18, 2019 · Updated 6 years ago
- Scripts that clean up OCR and munge Hathi metadata. ☆77 · Nov 4, 2017 · Updated 8 years ago
- A fast, simple, multilingual tokenizer ☆29 · May 24, 2017 · Updated 8 years ago
- Scripts to create git repositories for ALTO XML texts, like those from the British Library's scanned documents. ☆31 · Nov 3, 2017 · Updated 8 years ago
- Text Corpus of African American Fiction and Poetry, from 1853-1923 ☆10 · Aug 5, 2020 · Updated 5 years ago
- Evaluate your word embeddings ☆35 · Dec 3, 2019 · Updated 6 years ago
- spaCy match and replace, maintaining conjugation ☆36 · Dec 9, 2022 · Updated 3 years ago
- Lightweight, multilingual natural language processing ☆63 · Apr 8, 2013 · Updated 12 years ago
- ☆10 · Jul 2, 2019 · Updated 6 years ago
- An online comic maker built by the State Library of Queensland for the international Fun Palaces event. Concept by Matt Finch, based on "… ☆10 · Jan 27, 2017 · Updated 9 years ago
- A starter codebase for a Windrift game ☆11 · Nov 20, 2021 · Updated 4 years ago
- A set of base classes in order to perform training scripts for Neural Networks (by means of SNNS) and SVM (by means of SVM Light and SVM … ☆14 · Jun 24, 2011 · Updated 14 years ago
- Simple CORPORA list crawler ☆10 · Dec 2, 2016 · Updated 9 years ago
- Hungarian tokenizer. ☆14 · Mar 15, 2022 · Updated 3 years ago
- Training a classifier to reddit's TIL to find new things on Wikipedia ☆34 · Sep 25, 2015 · Updated 10 years ago
- Deploy a Ceramic daemon to AWS ☆13 · Apr 18, 2023 · Updated 2 years ago
- ☆10 · Jun 16, 2017 · Updated 8 years ago
- Supreme Court prediction model, "version" 2 ☆50 · Apr 24, 2017 · Updated 8 years ago
- CSV inspection ☆10 · Dec 20, 2022 · Updated 3 years ago
- Spark on Docker Swarm example code ☆11 · Nov 27, 2016 · Updated 9 years ago
- A python library for easily querying morphological inflection models trained on Unimorph ☆13 · Oct 23, 2022 · Updated 3 years ago
- USAAR participation in SemEval2015 ☆11 · Dec 21, 2022 · Updated 3 years ago
- An open-source cross-platform PDF reader with built-in hypothes.is annotations ☆10 · Mar 5, 2016 · Updated 10 years ago
- Backbone.Schema allows developers to specify rich Backbone models and collections with JSON-Schema. ☆38 · Apr 14, 2016 · Updated 9 years ago
- MDLText ☆12 · Jul 13, 2017 · Updated 8 years ago
- A practical introduction to Docker for data science ☆10 · May 13, 2019 · Updated 6 years ago
- ☆10 · Jul 25, 2016 · Updated 9 years ago
- ☆12 · Jan 13, 2026 · Updated last month