davidsbatista / lexicons
Dictionaries of names, surnames, acronyms and their extensions, stop-words, etc., which I gathered for different experiments.
☆28Updated 8 years ago
Alternatives and similar repositories for lexicons
Users who are interested in lexicons are comparing it to the libraries listed below
- A simple neural truecaser written in pytorch and allennlp.☆33Updated last year
- BERT models for many languages created from Wikipedia texts☆33Updated 5 years ago
- ☆32Updated 4 years ago
- Build a dialog dataset from online books in many languages☆76Updated 3 years ago
- An example of how to use spaCy for extremely large files without running into memory issues☆36Updated 3 years ago
- Open-source tools for morphological tagging, segmentation and stemming.☆40Updated 6 years ago
- Scripts and tools for doing unsupervised acceptability prediction.☆14Updated 2 years ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline.☆86Updated 4 years ago
- This repository contains the code for the Form-Context Model and its Attentive Mimicking variant.☆31Updated 5 years ago
- Unofficial implementation of Adaptive Input in PyTorch☆12Updated 6 years ago
- A collection of English tweets annotated in Universal Dependencies.☆39Updated 4 years ago
- Neural network sequence labeling model☆11Updated 6 years ago
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spacy provide state of …☆62Updated 5 years ago
- pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference☆61Updated 3 years ago
- General-Purpose Neural Networks for Sentence Boundary Detection☆73Updated 2 years ago
- Language modeling scripts based on TensorFlow☆58Updated 6 years ago
- Pre-trained models and code and data to train and use models from "Pushing the Limits of Paraphrastic Sentence Embeddings with Millions o…☆103Updated 2 years ago
- ☆31Updated 8 years ago
- High-coverage and high-precision lexica of terms annotated with emotion scores for English and Italian.☆155Updated last year
- An implementation of GrASP (Shnarch et al., 2017)☆23Updated 3 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021)☆48Updated 4 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate…☆52Updated 5 years ago
- Keras implementation of ontology aware token embeddings☆49Updated 7 years ago
- BlackboxNLP 2019: Analyzing and interpreting neural networks for NLP☆18Updated 6 years ago
- A tool for text normalisation via character-level machine translation☆13Updated 5 years ago
- ☆18Updated 2 years ago
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks.☆76Updated 2 years ago
- Implementation of a simple frame identification approach (SimpleFrameId) described in the paper "Out-of-domain FrameNet Semantic Role Lab…☆15Updated 8 years ago
- Gamma Agreement in Python☆45Updated last year
- Code and data used in named entity transliteration experiments☆57Updated 7 years ago