loretoparisi / fastLangID
Stand-alone Language Identification for Node.js JavaScript based on FastText
☆7 · Updated 6 years ago
Alternatives and similar repositories for fastLangID:
Users that are interested in fastLangID are comparing it to the libraries listed below
- Neural Sentential Paraphrase Generation to Augment Chatbot Training Dataset ☆21 · Updated 2 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages ☆10 · Updated last year
- Tensorflow Node.js Examples ☆25 · Updated 2 years ago
- A web interface to understand language-specific BERT-models ☆17 · Updated 11 months ago
- Bilingual sentence similarity classifier using Tensorflow ☆21 · Updated 5 years ago
- OpenNeuroSpell contains parts of NeuroSpell (http://neurospell.com/en.php) released as open-source. More code will be published as soon a… ☆20 · Updated 5 months ago
- Training a model without a dataset for natural language inference (NLI) ☆25 · Updated 4 years ago
- Dictionaries of names, surnames, acronyms and its extensions, stop-words, etc., which I gathered for different experiments. ☆28 · Updated 8 years ago
- English lexicon useful in NLP/NLU ☆15 · Updated last year
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spacy provide state of … ☆61 · Updated 4 years ago
- A simple neural truecaser written in pytorch and allennlp. ☆33 · Updated 9 months ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline. ☆86 · Updated 3 years ago
- The WebSplit Benchmark introducing "Split and Rephrase" task ☆63 · Updated 6 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆48 · Updated 3 years ago
- Expletives vomiting library... ☆13 · Updated 7 years ago
- A parallel evaluation data set of SAP software documentation with document structure annotation ☆11 · Updated 2 weeks ago
- Automated paraphrases Generation ☆36 · Updated 2 years ago
- CorrectLy - Open Source Spelling & Grammar correction ☆40 · Updated 2 years ago
- This repo is containing notes and implementations for cherry-picked publications of my particular interest ☆12 · Updated 4 years ago
- c++ mosestokenizer ☆17 · Updated last year
- "Zero-Training Sentence Embedding via Orthogonal Basis" paper implementation ☆19 · Updated 6 years ago
- A curated list of Natural Language Generation papers, tutorials, and blogs. ☆12 · Updated 6 years ago
- Corpus preprocessing ☆96 · Updated last year
- Morfessor EM+Prune ☆10 · Updated 4 years ago
- Python SDK for the TextRazor Text Analytics API ☆20 · Updated last year
- Code for extracting parallel corpora from pmindia ☆16 · Updated 5 years ago
- List of corpora annotated for coreference for different languages ☆17 · Updated 8 months ago
- Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation ☆14 · Updated 7 months ago
- PANiC - PAraphrasing Noun-Compounds ☆15 · Updated 7 years ago
- A raspberry pi 64bit image with spacy and neuralcoref pre-installed ☆21 · Updated 5 years ago