dbamman / book-nlp
Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.com/booknlp/booknlp)
☆309 Updated 2 years ago
Related projects
Alternatives and complementary repositories for book-nlp
- Collection of tools for building diachronic/historical word vectors ☆422 Updated 11 months ago
- Natural language processing resources for multiple languages, with an eye towards use for digital humanities. ☆124 Updated 3 years ago
- Sample implementation of a politeness model, trained on the Stanford Politeness Corpus ☆148 Updated 2 years ago
- A point-and-click tool for creating and analyzing topic models produced by MALLET. ☆106 Updated 3 years ago
- English data ☆201 Updated this week
- Retrofitting Word Vectors to Semantic Lexicons ☆374 Updated 5 years ago
- Linguistic and stylistic complexity measures for (literary) texts ☆77 Updated 10 months ago
- Various utilities for processing the data. ☆207 Updated this week
- Annotated dataset of 100 works of fiction to support tasks in natural language processing and the computational humanities. ☆342 Updated last year
- Practical Natural Language Processing Tools for Humans. Dependency Parsing, Syntactic Constituent Parsing, Semantic Role Labeling, Named … ☆192 Updated 7 years ago
- PredPatt: Predicate-Argument Extraction from Universal Dependencies ☆112 Updated 3 years ago
- A Python wrapper around the topic modeling functions of MALLET. ☆99 Updated 3 weeks ago
- Software and resources for natural language processing. ☆131 Updated 8 years ago
- A toolkit for corpus linguistics ☆199 Updated 5 years ago
- Named Entity Recognition data for Europeana Newspapers ☆173 Updated last year
- ☆32 Updated 2 years ago
- A command-line program to download text corpora. ☆33 Updated 7 years ago
- Take a MALLET to disciplinary history ☆99 Updated 2 years ago
- Python version for Doug Biber's Multidimensional Analysis (MDA) ☆27 Updated 5 months ago
- Python port of the Twokenize class of ark-tweet-nlp ☆141 Updated 6 years ago
- Cross-lingual metaphor detection. ☆66 Updated 5 years ago
- The official released annotations, both in .prop pointer format and as conll files. Does not contain the source texts ☆136 Updated 2 years ago
- Automatically exported from code.google.com/p/universal-pos-tags ☆128 Updated 2 years ago
- Corpus of Spanish Golden-Age Sonnets (with metrical annotation) / Corpus de Sonetos del Siglo de Oro (con anotación métrica) ☆34 Updated last year
- Python port of Mikolov's word2phrase.c from the word2vec toolkit ☆112 Updated 4 years ago
- Socially-Equitable Language Identification ☆78 Updated last year
- Corpus of Open Access articles from multiple fields in Science, Technology, and Medicine. ☆72 Updated 7 years ago
- An implementation of latent Dirichlet allocation in javascript ☆183 Updated 2 years ago
- ☆97 Updated 3 years ago
- Project on the history of genre. ☆22 Updated 4 years ago