tedunderwood / DataMunging
Scripts that clean up OCR and munge Hathi metadata.
☆74 Updated 7 years ago
Related projects
Alternatives and complementary repositories for DataMunging
- A digital humanities operating system that runs on a USB disk. ☆30 Updated 7 years ago
- Early Novels Database dataset ☆16 Updated 5 years ago
- High-performance text aligner for large collections of texts ☆44 Updated 2 weeks ago
- Tools for working with HTRC Feature Extraction files ☆39 Updated 2 years ago
- the EEBO TCP texts ☆31 Updated 6 years ago
- ☆28 Updated 3 years ago
- Topic Words in Context (TWiC) is a highly-interactive, browser-based visualization for MALLET topic models ☆51 Updated 7 years ago
- Project on the history of genre. ☆22 Updated 4 years ago
- Python implementation of the Zeta score for contrastive text analysis ☆14 Updated 3 years ago
- Detect and align similar passages ☆88 Updated 2 months ago
- Data Mining Historical Newspaper Metadata (METS/ALTO formats) ☆24 Updated 2 years ago
- A textual corpus database for the digital humanities. ☆59 Updated 4 years ago
- EFES (EpiDoc Front End Services) is a custom and readily customizable platform for publication and search/indexing of EpiDoc files, based… ☆31 Updated 4 months ago
- An R package for analysis of dramatic texts ☆15 Updated last year
- Python package for harvesting records from OAI-PMH provider(s). ☆62 Updated 2 years ago
- ☆19 Updated 7 years ago
- Text collections made available by the CLiGS group. ☆22 Updated 2 years ago
- The Open Scholarly Edition of James Joyce's A Portrait of the Artist as a Young Man ☆20 Updated 5 years ago
- Topic Modeling Workflow in Python ☆16 Updated last year
- Diachronic Spanish Sonnet Corpus. Canonical and minor authors in Spanish (Europe, America and Asia): 15th to 20th century ☆15 Updated last year
- A Python client for the DPLA API ☆43 Updated 2 years ago
- A command-line program to download text corpora. ☆33 Updated 7 years ago
- ALTO XML schema - latest and all former versions ☆51 Updated 3 months ago
- Tutorial on NE processing for Digital Humanities - DH Utrecht 2019 ☆25 Updated 5 years ago
- A point-and-click tool for creating and analyzing topic models produced by MALLET. ☆106 Updated 3 years ago
- Netherlands eScience Center - Shifting Concepts Through Time project ☆26 Updated 2 years ago
- Medieval Manuscripts in Oxford Libraries: TEI catalogue descriptions ☆33 Updated this week
- Digital Humanities Across Borders ☆46 Updated 7 months ago
- A trend viewer written in Python/JavaScript ☆21 Updated 2 weeks ago