pgcorpus / gutenberg-analysis
Analysis of gutenberg dataset
☆44 Updated 6 years ago
Alternatives and similar repositories for gutenberg-analysis
Users who are interested in gutenberg-analysis are comparing it to the libraries listed below
- ☆17 Updated last year
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do … ☆80 Updated 10 months ago
- Wikipedia based dataset to train relationship classifiers and fact extraction models ☆25 Updated 3 years ago
- An implementation of GrASP (Shnarch et al., 2017) ☆21 Updated 2 years ago
- FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction ☆24 Updated 2 years ago
- Align the token outputs from Spacy and Huggingface to help understand what language structures transformers see ☆44 Updated 3 years ago
- ZS4IE: A Toolkit for Zero-Shot Information Extraction with Simple Verbalizations ☆27 Updated 3 years ago
- MinScIE is an Open Information Extraction system which provides structured knowledge enriched with semantic information about citations. ☆15 Updated 5 years ago
- classy is a simple-to-use library for building high-performance Machine Learning models in NLP. ☆87 Updated last month
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ☆69 Updated 3 years ago
- REMERGE - Multi-Word Expression discovery algorithm ☆14 Updated 2 years ago
- Finds linguistic patterns effortlessly ☆36 Updated last year
- Summary Explorer is a tool to visually explore the state-of-the-art in text summarization. ☆44 Updated last year
- ☆64 Updated 2 years ago
- Bayesian Assessment of Hypotheses ☆24 Updated last year
- BERT models for many languages created from Wikipedia texts ☆33 Updated 4 years ago
- The Universal Decompositional Semantics (UDS) dataset and the Decomp toolkit ☆57 Updated last year
- Data programming by demonstration for information extraction and span annotation ☆35 Updated 3 years ago
- Linguistic and stylistic complexity measures for (literary) texts ☆81 Updated last year
- SciWING is a modern toolkit for scientific document processing from WING-NUS ☆63 Updated 2 years ago
- This repository hosts the code for a tokenizer of tweets. ☆12 Updated 6 years ago
- Fastlaw's purpose is to replace generic word embeddings for work on supervised machine learning NLP-tasks with legal texts. ☆38 Updated 5 years ago
- ParaNames: A multilingual resource for parallel names ☆32 Updated 11 months ago
- How Contextual are Contextualized Word Representations? ☆41 Updated 5 years ago
- ☆19 Updated 3 years ago
- A Super-Lightweight Annotation Tool for Experts: Label text in a terminal with just Python ☆101 Updated 4 months ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 Updated 2 years ago
- ☆76 Updated 3 years ago
- A lightweight but powerful library to build token indices for NLP tasks, compatible with major Deep Learning frameworks like PyTorch and … ☆51 Updated 5 months ago
- A corpus and code for understanding norms and subjectivity. 🤖 ☆49 Updated 7 months ago