tedunderwood / DataMunging
Scripts that clean up OCR and munge Hathi metadata.
☆76 Updated 7 years ago
Alternatives and similar repositories for DataMunging
Users who are interested in DataMunging compare it to the repositories listed below.
- Tools for working with HTRC Feature Extraction files ☆39 Updated 5 months ago
- A digital humanities operating system that runs on a USB disk. ☆31 Updated 7 years ago
- ☆19 Updated 8 years ago
- Data Mining Historical Newspaper Metadata (METS/ALTO formats) ☆25 Updated 2 years ago
- Python implementation of the Zeta score for contrastive text analysis ☆14 Updated 3 years ago
- Topic Words in Context (TWiC) is a highly-interactive, browser-based visualization for MALLET topic models ☆51 Updated 7 years ago
- High-performance text aligner for large collections of texts ☆51 Updated 3 weeks ago
- Project on the history of genre. ☆23 Updated 5 years ago
- A textual corpus database for the digital humanities. ☆62 Updated 4 years ago
- ☆28 Updated 4 years ago
- A point-and-click tool for creating and analyzing topic models produced by MALLET. ☆108 Updated 4 years ago
- Humanities Data Curation Record ☆11 Updated 7 years ago
- Digital Humanities Across Borders ☆48 Updated last year
- This is a public repository for sharing, improving, and versioning "The Topic Modeling Game," a lesson developed by Lisa Rhody to teach t… ☆10 Updated 7 years ago
- A python client for the DPLA API ☆43 Updated 2 years ago
- Contains materials for a work in progress - "A Humanist's Cookbook for Natural Language Processing in Python." ☆40 Updated 3 years ago
- Code and data to support the article, "How quickly do literary standards change?" ☆22 Updated 7 years ago
- Text collections made available by the CLiGS group. ☆23 Updated 3 years ago
- A Python library for topic modeling and visualization ☆65 Updated 4 years ago
- the EEBO TCP texts ☆34 Updated 7 years ago
- The GitHub repository for the AI for Humanists Project ☆18 Updated last month
- Sharable scripts and stylesheets from the Northeastern University Women Writers Project ☆23 Updated last month
- Files for the On The Books project ☆35 Updated 6 months ago
- A deep learning architecture for reference mining from literature in the arts and humanities. ☆16 Updated 5 years ago
- The Open Scholarly Edition of James Joyce's A Portrait of the Artist as a Young Man ☆20 Updated 6 years ago
- EFES (EpiDoc Front End Services) is a custom and readily customizable platform for publication and search/indexing of EpiDoc files, based… ☆31 Updated 4 months ago
- Early Novels Database dataset ☆16 Updated 6 years ago
- Special Topics in AI: Artificial Intelligence as an Archival Science ☆17 Updated last year
- An R package for analysis of dramatic texts ☆15 Updated 2 years ago
- Topic Modeling Workflow in Python ☆16 Updated 2 years ago