DH-Box / corpus-downloader
A command-line program to download text corpora.
☆34 · Updated 8 years ago
Alternatives and similar repositories for corpus-downloader
Users who are interested in corpus-downloader are comparing it to the libraries listed below.
- A point-and-click tool for creating and analyzing topic models produced by MALLET. ☆111 · Updated 4 years ago
- System for building, visualizing, and working with LDA topic models ☆97 · Updated this week
- (Mental) maps of texts with kernel density estimation and force-directed networks. ☆108 · Updated 10 years ago
- The Art of Literary Text Analysis ☆168 · Updated 6 years ago
- A Python library for topic modeling and visualization ☆66 · Updated 5 years ago
- Topic Words in Context (TWiC) is a highly-interactive, browser-based visualization for MALLET topic models ☆51 · Updated 8 years ago
- Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.co… ☆315 · Updated 3 years ago
- A textual corpus database for the digital humanities. ☆62 · Updated 5 years ago
- A digital humanities operating system that runs on a USB disk. ☆32 · Updated 8 years ago
- Natural language processing resources for multiple languages, with an eye towards use for digital humanities. ☆127 · Updated 4 years ago
- Netherlands eScience Center - Shifting Concepts Through Time project ☆27 · Updated 3 years ago
- Scripts that clean up OCR and munge Hathi metadata. ☆77 · Updated 8 years ago
- Project on the history of genre. ☆23 · Updated 5 years ago
- Detect and align similar passages ☆111 · Updated last month
- Named Entities Recognition Annotator Tool for Europeana Newspapers ☆61 · Updated 7 years ago
- Practical Approaches to Data Science with Text ☆39 · Updated 5 years ago
- Tools for text tokenization and encoding ☆84 · Updated 4 years ago
- Take a MALLET to disciplinary history ☆99 · Updated 3 years ago
- Course materials for Introduction to Computational Literary Analysis, taught at UC Berkeley in Summer 2018, 2019, and 2020, at Columbia U… ☆90 · Updated 3 years ago
- Graph-based tool for disambiguation and linking of named entities to Linked Data sets for Digital Humanities and heritage texts ☆27 · Updated 4 years ago
- An implementation of latent Dirichlet allocation in JavaScript ☆185 · Updated 3 years ago
- Python implementation of the Zeta score for contrastive text analysis ☆14 · Updated 4 years ago
- Quantitative Text Analysis for the digitale Geisteswissenschaften ☆47 · Updated 10 years ago
- Data Server for Topic Models ☆122 · Updated 2 years ago
- Workshop materials for our DH2018 workshop on word vectors. Created by Eun Seo Jo, Javier de la Rosa, and Scott Bailey ☆15 · Updated 7 years ago
- SerendipSlim is a visualization tool for exploring topic models built on large collections of text documents. ☆39 · Updated 7 years ago
- Explore your own text collection with a topic model – without prior knowledge. ☆65 · Updated last month
- The Python-language successor to the TABARI event-data coding software. ☆45 · Updated 8 years ago
- Citation Classification using hybrid neural network model for Wikipedia References ☆30 · Updated 2 years ago
- This repository contains the Framester resource, the main outcome of the framester project. ☆33 · Updated 3 weeks ago