davidsbatista / lexicons
Dictionaries of names, surnames, acronyms and their extensions, stop-words, etc., which I gathered for different experiments.
☆28 · Updated 8 years ago
Alternatives and similar repositories for lexicons
Users who are interested in lexicons are comparing it to the repositories listed below.
- A simple neural truecaser written in PyTorch and AllenNLP. ☆33 · Updated last year
- Neural Sentential Paraphrase Generation to Augment Chatbot Training Dataset ☆21 · Updated 3 years ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline. ☆86 · Updated 4 years ago
- BERT models for many languages created from Wikipedia texts ☆33 · Updated 5 years ago
- Build a dialog dataset from online books in many languages ☆76 · Updated 3 years ago
- A collection of English tweets annotated in Universal Dependencies. ☆39 · Updated 4 years ago
- This repository contains the code for the Form-Context Model and its Attentive Mimicking variant. ☆31 · Updated 5 years ago
- Code and data used in named entity transliteration experiments ☆57 · Updated 7 years ago
- Scripts and tools for doing unsupervised acceptability prediction. ☆14 · Updated 2 years ago
- Language modeling scripts based on TensorFlow ☆58 · Updated 6 years ago
- ☆32 · Updated 4 years ago
- ☆34 · Updated 5 years ago
- Pre-trained models and code and data to train and use models from "Pushing the Limits of Paraphrastic Sentence Embeddings with Millions o… ☆103 · Updated 2 years ago
- Open-source tools for morphological tagging, segmentation and stemming. ☆40 · Updated 6 years ago
- A curated list of Natural Language Generation papers, tutorials, and blogs. ☆12 · Updated 7 years ago
- A dataset of atomic Wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contai… ☆105 · Updated 6 years ago
- A way to do annotations for NER. TALEN: Tool for Annotation of Low-resource ENtities ☆118 · Updated 5 months ago
- Text processing library for sentiment analysis and related tasks ☆27 · Updated 7 years ago
- High-coverage and high-precision lexica of terms annotated with emotion scores for English and Italian. ☆155 · Updated last year
- Unofficial implementation of Adaptive Input in PyTorch ☆12 · Updated 6 years ago
- ☆31 · Updated 8 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 · Updated 3 years ago
- Keras implementation of ontology aware token embeddings ☆49 · Updated 7 years ago
- COMBO is a jointly trained tagger, lemmatizer and dependency parser. ☆35 · Updated 2 years ago
- C++ mosestokenizer ☆18 · Updated last year
- General-Purpose Neural Networks for Sentence Boundary Detection ☆73 · Updated 2 years ago
- A Benchmark Dataset for Understanding Disfluencies in Question Answering ☆64 · Updated 4 years ago
- Code to compute topic coherence for several topic cardinalities and aggregate scores across them ☆22 · Updated 3 months ago
- This repo contains code and dataset for the Opinosis Summarization Framework ☆51 · Updated 6 years ago
- A tiny BERT for low-resource monolingual models ☆31 · Updated last week