dbamman / book-nlp
Natural language processing pipeline for book-length documents (archival Java version; for current Python version, see: https://github.com/booknlp/booknlp)
☆316 · Updated 3 years ago
Alternatives and similar repositories for book-nlp
Users who are interested in book-nlp are comparing it to the libraries listed below.
- Collection of tools for building diachronic/historical word vectors ☆443 · Updated 2 years ago
- Sample implementation of a politeness model, trained on the Stanford Politeness Corpus ☆148 · Updated 3 years ago
- Annotated dataset of 100 works of fiction to support tasks in natural language processing and the computational humanities. ☆368 · Updated 3 years ago
- Various utilities for processing the data. ☆216 · Updated this week
- A command-line program to download text corpora. ☆34 · Updated 8 years ago
- Linguistic and stylistic complexity measures for (literary) texts ☆84 · Updated last year
- ConllEditor is a tool to edit dependency syntax trees in CoNLL-U format. ☆57 · Updated 3 weeks ago
- An implementation of latent Dirichlet allocation in JavaScript ☆185 · Updated 3 years ago
- Named Entity Recognition data for Europeana Newspapers ☆173 · Updated 2 years ago
- Natural language processing resources for multiple languages, with an eye towards use for digital humanities. ☆127 · Updated 4 years ago
- English data ☆218 · Updated 2 weeks ago
- Automatically exported from code.google.com/p/universal-pos-tags ☆130 · Updated 3 years ago
- 🆕 Work continues on INCEpTION 👉 https://github.com/inception-project/inception 👈 -- ⚠️ The official WebAnno repository has reached the… ☆249 · Updated 2 years ago
- The Art of Literary Text Analysis ☆168 · Updated 6 years ago
- Socially-Equitable Language Identification ☆78 · Updated 2 years ago
- A Python wrapper around the topic modeling functions of MALLET. ☆105 · Updated last year
- ANNIS is an open source, versatile web browser-based search and visualization architecture for complex multilevel linguistic corpora with… ☆75 · Updated 3 months ago
- The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors (COLING 2016… ☆68 · Updated 3 years ago
- Topic Words in Context (TWiC) is a highly-interactive, browser-based visualization for MALLET topic models ☆51 · Updated 8 years ago
- Python interface for converting Penn Treebank trees to Stanford Dependencies and Universal Dependencies ☆69 · Updated 6 years ago
- Python port of the Twokenize class of ark-tweet-nlp ☆142 · Updated 7 years ago
- Take a MALLET to disciplinary history ☆99 · Updated 3 years ago
- Community Curated NLP List ☆201 · Updated 3 years ago
- analyze text with empath ☆339 · Updated 8 years ago
- ☆98 · Updated 4 years ago
- A simple interface to the Project Gutenberg corpus. ☆331 · Updated 3 years ago
- ☆59 · Updated 10 years ago
- Netherlands eScience Center - Shifting Concepts Through Time project ☆27 · Updated 3 years ago
- Corpus of Open Access articles from multiple fields in Science, Technology, and Medicine. ☆74 · Updated 8 years ago
- Universal Dependencies online documentation ☆287 · Updated this week