tedunderwood / DataMunging
Scripts that clean up OCR and munge Hathi metadata.
☆76 Updated 7 years ago
Alternatives and similar repositories for DataMunging:
Users who are interested in DataMunging compare it to the repositories listed below.
- Tools for working with HTRC Feature Extraction files ☆39 Updated 4 months ago
- A digital humanities operating system that runs on a USB disk. ☆31 Updated 7 years ago
- Code and data to support the article, "How quickly do literary standards change?" ☆22 Updated 7 years ago
- A point-and-click tool for creating and analyzing topic models produced by MALLET. ☆108 Updated 4 years ago
- Workshop materials for our DH2018 workshop on word vectors. Created by Eun Seo Jo, Javier de la Rosa, and Scott Bailey ☆15 Updated 6 years ago
- ☆19 Updated 8 years ago
- Project on the history of genre. ☆22 Updated 5 years ago
- ☆28 Updated 4 years ago
- the EEBO TCP texts ☆34 Updated 7 years ago
- Special Topics in AI: Artificial Intelligence as an Archival Science ☆17 Updated 11 months ago
- Detect and align similar passages ☆100 Updated 3 months ago
- Early Novels Database dataset ☆16 Updated 6 years ago
- Diachronic Spanish Sonnet Corpus. Canonical and minor authors in Spanish (Europe, America and Asia): 15th to 20th century ☆16 Updated last year
- EFES (EpiDoc Front End Services) is a custom and readily customizable platform for publication and search/indexing of EpiDoc files, based… ☆31 Updated 3 months ago
- A Python library for topic modeling and visualization ☆65 Updated 4 years ago
- A textual corpus database for the digital humanities. ☆62 Updated 4 years ago
- Topic Words in Context (TWiC) is a highly-interactive, browser-based visualization for MALLET topic models ☆51 Updated 7 years ago
- Download and manipulate HathiTrust wordcount data in the tidyverse ☆9 Updated 3 years ago
- High-performance text aligner for large collections of texts ☆51 Updated 2 weeks ago
- This is a public repository for sharing, improving, and versioning "The Topic Modeling Game," a lesson developed by Lisa Rhody to teach t… ☆10 Updated 6 years ago
- Python for Humanities ☆13 Updated last week
- Sharable scripts and stylesheets from the Northeastern University Women Writers Project ☆22 Updated 3 weeks ago
- Personal modeling application for Linked Data. ☆26 Updated 6 years ago
- Data Mining Historical Newspaper Metadata (METS/ALTO formats) ☆25 Updated 2 years ago
- Digital Humanities Across Borders ☆48 Updated last year
- Text collections made available by the CLiGS group. ☆23 Updated 3 years ago
- Exercises for the XQuery Workshops at XQuery at DH2017 ☆50 Updated 6 years ago
- Text Re-use Alignment Visualization ☆38 Updated 7 years ago
- Code and data supporting "NovelTM Data Sets for English-Language Fiction." ☆24 Updated 4 years ago
- Modeling and visualizing physical manuscript collation ☆51 Updated 2 years ago