shuyo / ldig
Language Detection with Infinity-gram
☆231 Updated 9 years ago
Related projects
Alternatives and complementary repositories for ldig
- Socially-Equitable Language Identification ☆78 Updated last year
- Python port of the Twokenize class of ark-tweet-nlp ☆141 Updated 6 years ago
- 💫 Scripts, tools and resources for developing spaCy ☆125 Updated 5 years ago
- ☆151 Updated 4 years ago
- Python port of Mikolov's word2phrase.c from the word2vec toolkit ☆112 Updated 4 years ago
- Hunspell extension for spaCy 2.0. ☆94 Updated 3 months ago
- Twitter hashtag prediction ☆281 Updated 7 years ago
- ☆97 Updated 3 years ago
- Named Entity Recognition data for Europeana Newspapers ☆173 Updated last year
- Tokenization and pre-processing for Twitter data used to train classifiers. ☆71 Updated 8 years ago
- A Multilingual and Multilevel Representation Learning Toolkit for NLP ☆117 Updated 6 years ago
- Making sense embedding out of word embeddings using graph-based word sense induction ☆212 Updated 3 years ago
- A tool to segment text based on frequencies and the Viterbi algorithm: "#TheBoyWhoLived" => ['#', 'The', 'Boy', 'Who', 'Lived'] ☆82 Updated 8 years ago
- SymSpellCompound: compound aware automatic spelling correction ☆66 Updated 6 years ago
- A collection of functions that measure the readability of a given body of text ☆191 Updated 7 years ago
- Temporal Expression Recognition and Normalisation in Python ☆78 Updated 8 years ago
- Code accompanying our EMNLP paper Learning Language Representations for Typology Prediction ☆71 Updated 7 years ago
- High-coverage and high-precision lexica of terms annotated with emotion scores for English and Italian. ☆152 Updated 3 weeks ago
- Fast Word Clustering Software ☆74 Updated 3 months ago
- Python bindings to the Compact Language Detector ☆33 Updated 4 years ago
- Framework for doing NER and other types of entity recognition, in Python ☆68 Updated 2 years ago
- A toolkit for corpus linguistics ☆199 Updated 5 years ago
- Twitter named entity extraction for WNUT 2016 http://noisy-text.github.io/2016/ner-shared-task.html ☆139 Updated 2 years ago
- Fast supervised sentence boundary detection using the averaged perceptron ☆90 Updated 5 years ago
- Sample implementation of a politeness model, trained on the Stanford Politeness Corpus ☆148 Updated 2 years ago
- displaCy-ent.js: An open-source named entity visualiser for the modern web ☆198 Updated 6 years ago
- A Dependency Parser for Tweets ☆79 Updated 5 years ago