A command-line program to download text corpora.
☆34 · Aug 12, 2017 · Updated 8 years ago
Alternatives and similar repositories for corpus-downloader
Users that are interested in corpus-downloader are comparing it to the libraries listed below.
- ENGL 87400 - Text Transformations (Graduate Center, CUNY - Spring 2015) · ☆12 · Mar 30, 2015 · Updated 11 years ago
- The Art of Literary Text Analysis · ☆169 · Apr 4, 2019 · Updated 7 years ago
- Plots various graphs for a series of plaintext files in a directory · ☆19 · Jun 6, 2016 · Updated 9 years ago
- Training a classifier on Reddit's TIL to find new things on Wikipedia · ☆35 · Sep 25, 2015 · Updated 10 years ago
- Notebook for looking at 35 years of historical US degrees data from NCES-IPEDS · ☆11 · Dec 18, 2018 · Updated 7 years ago
- Histonets is an application to convert images of scanned maps into digital networks · ☆20 · Oct 16, 2017 · Updated 8 years ago
- spaCy-to-naf converter · ☆21 · Jun 10, 2025 · Updated 11 months ago
- A structured list of text corpora, created for use with a corpus downloader. · ☆13 · Aug 27, 2016 · Updated 9 years ago
- A digital humanities operating system that runs on a USB disk. · ☆32 · Jul 5, 2017 · Updated 8 years ago
- Scala library that shells to Tesseract to make PDFs searchable · ☆16 · Jan 25, 2019 · Updated 7 years ago
- Client to browse and edit PeriodO data · ☆17 · Apr 26, 2026 · Updated last week
- Document classification with Apache Spark on an American Classic · ☆10 · Sep 25, 2015 · Updated 10 years ago
- XSLT for converting TEI MsDescription to IIIF manifests · ☆13 · Oct 18, 2016 · Updated 9 years ago
- Regex-like pattern tree matching, but on a sentence's tree instead of strings · ☆42 · Mar 6, 2018 · Updated 8 years ago
- Materials related to the Project Laboratory session of #GCDRI · ☆16 · Nov 29, 2017 · Updated 8 years ago
- DBpedia Neural Question Answering Dataset · ☆18 · Jun 28, 2020 · Updated 5 years ago
- (Mental) maps of texts with kernel density estimation and force-directed networks. · ☆108 · Jun 22, 2015 · Updated 10 years ago
- Minimal code to train ELMo models in recent versions of TensorFlow · ☆14 · Apr 30, 2023 · Updated 3 years ago
- The PELAGIOS Cookbook · ☆25 · Jan 16, 2021 · Updated 5 years ago
- InfiniteUlysses.com repo as it was when I finished the related Ph.D. project. See instead github.com/amandavisconti/infinite-ulysses-publ… · ☆26 · Mar 15, 2022 · Updated 4 years ago
- ☆12 · Apr 21, 2026 · Updated 2 weeks ago
- Topic Words in Context (TWiC) is a highly-interactive, browser-based visualization for MALLET topic models · ☆49 · Jul 13, 2017 · Updated 8 years ago
- Twitter API session at GC Digital Research Bootcamp · ☆19 · May 18, 2018 · Updated 7 years ago
- A design prototype for DocNow to learn with · ☆14 · Apr 8, 2017 · Updated 9 years ago
- Annotator PouchDB Storage Plugin · ☆11 · Apr 8, 2016 · Updated 10 years ago
- Wrapper for DKPro Core to extract linguistic information from books. · ☆16 · Feb 26, 2022 · Updated 4 years ago
- A poetry generator from a scraped corpus of Spanish poetry. EDA and general NLP tasks included. · ☆12 · Jan 21, 2021 · Updated 5 years ago
- This repository contains material on NLP (prepared for a talk for R-Ladies Bergen) · ☆10 · Dec 9, 2020 · Updated 5 years ago
- A plugin that provides support for working with Digital Facsimiles in Text Encoding Initiative (TEI) vocabulary. The plugin contribute… · ☆25 · Jun 16, 2025 · Updated 10 months ago
- Spanish translation of the swirl R_Programming course · ☆19 · Apr 19, 2021 · Updated 5 years ago
- Scripts to make Pokémon Crystal accessible to screen readers · ☆18 · Apr 5, 2020 · Updated 6 years ago
- ☆16 · Oct 5, 2018 · Updated 7 years ago
- Digital Pedagogy in the Humanities: Concepts, Models, and Experiments · ☆130 · Oct 16, 2021 · Updated 4 years ago
- Data science and machine learning resources for screen reader users · ☆21 · Jun 24, 2023 · Updated 2 years ago
- Readable.cc is a readable news reader. · ☆36 · Nov 24, 2014 · Updated 11 years ago
- Archives for Black Lives in Philly was inspired by Jarrett Drake, Digital Archivist at Princeton University, and his work to end archives… · ☆19 · May 5, 2022 · Updated 4 years ago
- SWAN: Saar Web-based ANotation system · ☆14 · May 16, 2019 · Updated 6 years ago
- ☆13 · Aug 29, 2021 · Updated 4 years ago
- ☆14 · Mar 24, 2026 · Updated last month