loretoparisi / fastLangID
Stand-alone Language Identification for Node.js JavaScript based on FastText
☆7 · Updated 5 years ago
Related projects:
- Neural Sentential Paraphrase Generation to Augment Chatbot Training Dataset ☆22 · Updated last year
- Training a model without a dataset for natural language inference (NLI) ☆25 · Updated 4 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages ☆11 · Updated 7 months ago
- Bilingual sentence similarity classifier using TensorFlow ☆19 · Updated 4 years ago
- A web interface to understand language-specific BERT models ☆17 · Updated 5 months ago
- Dictionaries of names, surnames, acronyms and its extensions, stop-words, etc., which I gathered for different experiments. ☆29 · Updated 7 years ago
- A simple neural truecaser written in PyTorch and AllenNLP. ☆31 · Updated 3 months ago
- BERT models for many languages created from Wikipedia texts ☆34 · Updated 4 years ago
- List of corpora annotated for coreference for different languages ☆16 · Updated last month
- A text similarity computation using minhashing and Jaccard distance on the Reuters dataset ☆16 · Updated 6 years ago
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spaCy provide state of … ☆61 · Updated 4 years ago
- Python SDK for the TextRazor Text Analytics API ☆20 · Updated last year
- OpenNeuroSpell contains parts of NeuroSpell (http://neurospell.com/en.php) released as open-source. More code will be published as soon a… ☆20 · Updated 2 years ago
- Code for extracting parallel corpora from pmindia ☆16 · Updated 4 years ago
- Conversational dataset from the Chit-Chat Challenge ☆25 · Updated 11 months ago
- Hi. I am jann. I am text input - text output chatbot model that is JUST approximate nearest neighbour. ☆35 · Updated 2 years ago
- Extremely easy to use sequence to sequence library with attention, for text to text conversion tasks. ☆39 · Updated 3 years ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline. ☆85 · Updated 3 years ago
- Generate BERT vocabularies and pretraining examples from Wikipedias ☆18 · Updated 4 years ago
- As good as new. How to successfully recycle English GPT-2 to make models for other languages (ACL Findings 2021) ☆46 · Updated 3 years ago
- Neural Text Simplification to Improve Chatbot Performance ☆13 · Updated 6 years ago
- PANiC - PAraphrasing Noun-Compounds ☆15 · Updated 6 years ago
- This repo contains the code used to generate the French Wikipedia sample used in the QA annotation project PIAF ☆11 · Updated 3 years ago
- OKR: A Consolidated Open Knowledge Representation for Multiple Texts ☆39 · Updated 6 years ago
- A collection of English tweets annotated in Universal Dependencies. ☆39 · Updated 2 years ago
- Numeric fused-head identification and resolution ☆33 · Updated 4 years ago
- ☆33 · Updated 3 years ago
- A question-answering dataset with a focus on subjective information ☆43 · Updated 8 months ago
- Deep neural parser for database query ☆19 · Updated last year
- The WebSplit Benchmark introducing "Split and Rephrase" task ☆64 · Updated 5 years ago