nlp-compromise / nlp-corpus
Varied English texts for modern NLP testing
☆75 · Updated 2 years ago
Alternatives and similar repositories for nlp-corpus
Users interested in nlp-corpus are comparing it to the libraries listed below:
- Markov chain combined with word vector embedding (word2vec) and part-of-speech tagging, for context-aware text generation. License: MIT ☆99 · Updated 7 years ago
- Text summarization using LexRank ☆54 · Updated 6 years ago
- TextRank algorithm implementation in JavaScript ☆41 · Updated 9 years ago
- WordNet in JSON format. ☆91 · Updated 4 years ago
- Python SDK for the TextRazor Text Analytics API ☆20 · Updated last year
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆111 · Updated 3 weeks ago
- The Non-Official Characterization (NOC) List is a knowledge-base containing semantic triples about famous people, living and dead, fictio… ☆24 · Updated 6 years ago
- This project represents the 300-dimensional word vectors from word2vec as JSON. ☆123 · Updated 8 years ago
- Linking Entities in CommonCrawl Dataset onto Wikipedia Concepts ☆59 · Updated 12 years ago
- Expose spaCy NLP text parsing to Node.js (and other languages) via Socket.IO ☆225 · Updated 2 years ago
- A tool for visualizing trees, tailored specifically to the analysis of parse trees. ☆81 · Updated 4 years ago
- A JavaScript implementation of the Porter Stemmer ☆96 · Updated 3 years ago
- A thin GraphQL wrapper around spaCy ☆21 · Updated 4 years ago
- A Raspberry Pi 64-bit image with spaCy and neuralcoref pre-installed ☆21 · Updated 5 years ago
- Generate rules from lists of words ☆16 · Updated 3 years ago
- Command-line tool to extract taxonomies from Wikidata ☆126 · Updated 5 years ago
- A modular annotation system that supports complex, interactive annotation graphs embedded on top of sequences of text. ☆92 · Updated 3 years ago
- High-coverage and high-precision lexica of terms annotated with emotion scores for English and Italian. ☆152 · Updated 3 months ago
- My Part of Speech Tagger ☆42 · Updated 8 years ago
- Scripts and microservice to feed an Elasticsearch instance with Wikidata and Inventaire entities, and keep those up-to-date ☆41 · Updated 4 years ago
- English lexicon useful in NLP/NLU ☆15 · Updated last year
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆51 · Updated 3 years ago
- Quill's library of open source NLP algorithms and data sets. ☆52 · Updated 10 months ago
- Python library for Natural Language Generation (including SimpleNLG wrapper) ☆44 · Updated 2 years ago
- Contextual Graph Knowledge Base ☆86 · Updated 7 years ago
- 🤹‍♀️ Query spaCy's linguistic annotations using GraphQL ☆86 · Updated 6 years ago
- spaCy pipeline component for adding text readability metadata to Doc objects. ☆56 · Updated 5 years ago
- FastTag part-of-speech tagger, JavaScript implementation ☆65 · Updated 8 years ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆61 · Updated 9 months ago
- A tool for analyzing the word histories of a text. ☆34 · Updated 2 months ago