davidsbatista / lexicons

Dictionaries of names, surnames, acronyms and their extensions, stop-words, etc., which I gathered for different experiments.

☆28

Alternatives and similar repositories for lexicons

Users who are interested in lexicons are comparing it to the repositories listed below

mayhewsw / pytorch-truecaser
A simple neural truecaser written in pytorch and allennlp.
☆33 Updated last year
Tianwei-She / awesome-natural-language-generation
A curated list of Natural Language Generation papers, tutorials, and blogs.
☆12 Updated 6 years ago
brmson / question-classification
☆31 Updated 8 years ago
vincent9514 / Text-Variant-Generation
📄Neural Sentential Paraphrase Generation to Augment Chatbot Training Dataset
☆21 Updated 2 years ago
laugustyniak / textlytics
Text processing library for sentiment analysis and related tasks
☆27 Updated 6 years ago
ghaddarAbs / WiNER
☆33 Updated 3 years ago
ffancellu / NegNN
Neural Network for Automatic Negation Detection
☆20 Updated 8 years ago
jhlau / topic-coherence-sensitivity
Code to compute topic coherence for several topic cardinalities and aggregate scores across them
☆22 Updated 4 months ago
alvations / expletives
Expletives vomiting library...
☆13 Updated 8 years ago
pdasigi / onto-lstm
Keras implementation of ontology aware token embeddings
☆49 Updated 6 years ago
muelletm / cistern
Open-source tools for morphological tagging, segmentation and stemming.
☆40 Updated 6 years ago
GorkaUrbizu / Coreference-Corpora-Resources
List of corpora annotated for coreference for different languages
☆17 Updated 11 months ago
TurkuNLP / wikibert
BERT models for many languages created from Wikipedia texts
☆33 Updated 5 years ago
jacobeisenstein / bayes-seg
Java code from the 2008 EMNLP paper "Bayesian Unsupervised Topic Segmentation" by Eisenstein and Barzilay
☆36 Updated 9 years ago
mheilman / tan-clustering
Hierarchical word clustering, following "Brown clustering" (Brown et al., 1992)
☆69 Updated 10 years ago
BramVanroy / spacy-extreme
An example of how to use spaCy for extremely large files without running into memory issues
☆36 Updated 2 years ago
mfaruqui / non-distributional
Non-distributional linguistic word vector representations.
☆62 Updated 7 years ago
360er0 / COMBO
COMBO is a jointly trained tagger, lemmatizer and dependency parser.
☆35 Updated 2 years ago
utkd / encdecmodel-hf
☆34 Updated 4 years ago
Noahs-ARK / idea_relations
A framework to identify relations between ideas in temporal text corpora.
☆28 Updated 7 years ago
harkous / embeddingsviz
Visualize word embeddings of a vocabulary in TensorBoard, including the neighbors
☆46 Updated 7 years ago
iai-group / DynamicEntitySummarization-DynES
Dynamic Entity Summarization (DynES)
☆20 Updated 6 years ago
dbmdz / deep-eos
General-Purpose Neural Networks for Sentence Boundary Detection
☆73 Updated 2 years ago
jaredleekatzman / Wordly
ADS Project
☆14 Updated 9 years ago
ajaech / calm
Context Aware Language Models
☆28 Updated 7 years ago
FerreroJeremy / Cross-Language-Dataset
A multilingual, multi-style and multi-granularity dataset for cross-language textual similarity detection
☆60 Updated 8 years ago
cltl / wsd-dynamic-sense-vector
☆25 Updated 2 years ago
akb89 / pyfn
A python module to process data for Frame Semantic Parsing
☆24 Updated 4 years ago
claravania / subword-lstm-lm
LSTM Language Model with Subword Units Input Representations
☆42 Updated 4 years ago
lverwimp / tf-lm
Language modeling scripts based on TensorFlow
☆58 Updated 5 years ago