berkmancenter / corpusbuilder
Corpus Build OCR platform
☆8Updated last year
Related projects
Alternatives and complementary repositories for corpusbuilder
- Visual analytics application for qualitative text analysis☆24Updated last year
- Statistical visualizations for Datasette using Seaborn☆11Updated 2 years ago
- A deep learning architecture for reference mining from literature in the arts and humanities.☆15Updated 5 years ago
- Python tools for text☆15Updated 4 years ago
- Performs unique entity estimation corresponding to Chen, Shrivastava, Steorts (2018).☆14Updated 5 years ago
- Scripts to take hand washing related text in (almost) any language and float it into a hand washing poster.☆9Updated 3 years ago
- A text processing pipeline for turning unstructured text data into hierarchical datasets☆15Updated 4 years ago
- Exploring textual and social measures of distance between genres.☆15Updated 5 years ago
- An alternative approach for probabilistic topic modeling based on agglomerative clustering of topics (not documents)☆12Updated 3 years ago
- IWAAN - An interactive Jupyter Notebook collection that lets you run analyses of Wikipedia article editing dynamics out-of-the-box on Bi…☆9Updated 6 months ago
- Entity linker for the newspaper collection of the National Library of the Netherlands. Links named entity mentions to DBpedia description…☆11Updated last year
- Data Mining Historical Newspaper Metadata (METS/ALTO formats)☆24Updated 2 years ago
- ☆17Updated 3 weeks ago
- Compare accuracies of udpipe models and spacy models which can be used for NLP annotation☆14Updated 6 years ago
- ☆12Updated 5 years ago
- A Datasette plugin providing an MLOps platform to train, eval and predict machine learning models☆15Updated last week
- R tools to download, ingest, and analyze the Phoenix dataset from the Open Event Data Alliance☆12Updated 8 years ago
- Python wrapper for a C++ Double Metaphone☆15Updated last year
- MoodCat😼 classifies the mood of English sentences.☆14Updated 2 years ago
- A browser extension providing Open Access bibliographical services☆14Updated last year
- Python package used to convert Jupyter Notebooks into Jekyll-ready documents, including validation and version control tagging☆21Updated 6 years ago
- A financial disclosure data extraction tool.☆13Updated last year
- An example of how to use spaCy for extremely large files without running into memory issues☆36Updated 2 years ago
- TopicScan: Visualization and validation interface for NMF Topic Modeling☆23Updated 4 years ago
- Introduction to Topic Modeling for TextXD 2019, 12/3/2019☆10Updated 4 years ago
- A collection of scripts for teaching and learning basic text mining methods in R☆10Updated 6 years ago
- Code and data for Teddy https://arxiv.org/abs/2001.05171.☆15Updated 2 years ago
- Implements the model described in "Identification, Interpretability, and Bayesian Word Embeddings"☆18Updated 5 years ago
- Visualize a corpus of texts as a landscape with the aid of text mining, graph visualization and self-organizing maps☆15Updated 2 years ago