davidsbatista / lexicons
Dictionaries of names, surnames, acronyms and their extensions, stop-words, etc., which I gathered for different experiments.
☆28 Updated 8 years ago
Alternatives and similar repositories for lexicons
Users who are interested in lexicons compare it to the repositories listed below.
- A simple neural truecaser written in pytorch and allennlp. ☆33 Updated last year
- Open-source tools for morphological tagging, segmentation and stemming. ☆40 Updated 6 years ago
- 📄Neural Sentential Paraphrase Generation to Augment Chatbot Training Dataset ☆21 Updated 2 years ago
- This repository contains the code for the Form-Context Model and its Attentive Mimicking variant. ☆31 Updated 5 years ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline. ☆86 Updated 4 years ago
- A collection of English tweets annotated in Universal Dependencies. ☆39 Updated 3 years ago
- Code and data used in named entity transliteration experiments ☆57 Updated 7 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 Updated 4 years ago
- Embeddings for n-grams ☆11 Updated 7 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 Updated 2 years ago
- Minimal code to train ELMo models in recent versions of TensorFlow ☆14 Updated 2 years ago
- Multi-lingual Text Processing ☆96 Updated 6 years ago
- Brown clustering in Python ☆22 Updated 7 years ago
- LSTM Language Model with Subword Units Input Representations ☆42 Updated 4 years ago
- ☆34 Updated 4 years ago
- c++ mosestokenizer ☆18 Updated last year
- BERT models for many languages created from Wikipedia texts ☆33 Updated 5 years ago
- Code to compute topic coherence for several topic cardinalities and aggregate scores across them ☆22 Updated 6 months ago
- ☆31 Updated 8 years ago
- Build a dialog dataset from online books in many languages ☆76 Updated 2 years ago
- A web interface to understand language-specific BERT-models ☆18 Updated last year
- Text processing library for sentiment analysis and related tasks ☆27 Updated 6 years ago
- Expletives vomiting library... ☆13 Updated 8 years ago
- A curated list of Natural Language Generation papers, tutorials, and blogs. ☆12 Updated 6 years ago
- Visualize word embeddings of a vocabulary in TensorBoard, including the neighbors ☆46 Updated 8 years ago
- The repository for the paper "When Do You Need Billions of Words of Pretraining Data?" ☆21 Updated 4 years ago
- Fast supervised sentence boundary detection using the averaged perceptron ☆90 Updated 6 years ago
- Keras implementation of ontology aware token embeddings ☆49 Updated 6 years ago
- Hierarchical word clustering, following "Brown clustering" (Brown et al., 1992) ☆70 Updated 10 years ago
- Language modeling scripts based on TensorFlow ☆58 Updated 6 years ago