mhagiwara / nanigonet
NanigoNet: Language detector for code-mixed input supporting 150+19 human+programming languages using deep neural networks
☆71 · Updated 2 years ago
Alternatives and similar repositories for nanigonet
Users who are interested in nanigonet are comparing it to the libraries listed below.
- A demonstration of hyperparameter optimization using Optuna for models implemented with AllenNLP. ☆16 · Updated 5 years ago
- Numeric fused-head identification and resolution ☆33 · Updated 6 years ago
- Code and datasets of "Multilingual Extractive Reading Comprehension by Runtime Machine Translation" ☆40 · Updated 7 years ago
- Accelerated NLP pipelines for fast inference on CPU. Built with Transformers and ONNX runtime. ☆127 · Updated 5 years ago
- LM Pretraining with PyTorch/TPU ☆137 · Updated 6 years ago
- Automatic extraction of edited sentences from text edition histories. ☆83 · Updated 3 years ago
- A lightweight but powerful library to build token indices for NLP tasks, compatible with major Deep Learning frameworks like PyTorch and … ☆51 · Updated last year
- Robsut Wrod Reocginiton via semi-Character Recurrent Neural Network ☆21 · Updated 8 years ago
- Language model powered proof reader for correcting contextual errors in natural language. ☆24 · Updated 2 years ago
- Train transformer-based models. ☆28 · Updated last week
- A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contai… ☆105 · Updated 6 years ago
- SImple SenTence EmbeddeR ☆74 · Updated 2 years ago
- An Interactive Tool for Scalable and Reproducible Error Analysis. ☆109 · Updated 4 years ago
- Hyperparameter Search for AllenNLP ☆140 · Updated 10 months ago
- Code for obtaining the Curation Corpus abstractive text summarisation dataset ☆128 · Updated 5 years ago
- Code and data for segmentation experiments. ☆20 · Updated 10 years ago
- An embeddable annotation tool for end-to-end cross-document coreference ☆42 · Updated 2 years ago
- BERT models for many languages created from Wikipedia texts ☆33 · Updated 5 years ago
- A Python implementation of SimString, a simple and efficient algorithm for approximate string matching. ☆125 · Updated 2 years ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline. ☆86 · Updated 4 years ago
- Viewer for the 🤗 datasets library. ☆86 · Updated 4 years ago
- pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference ☆61 · Updated 3 years ago
- A collection of selected models built with AllenNLP. ☆25 · Updated 5 years ago
- This repository contains the code for "BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Representations". ☆64 · Updated 5 years ago
- Code for bidirectional sequence generation (BiSon) for generating from BERT pre-trained models. ☆51 · Updated 5 years ago
- A Corpus for Multilingual Document Classification in Eight Languages. ☆152 · Updated 3 years ago
- Many Natural Language Processing tasks rely on sentence boundary detection (SBD). Although amazing libraries like spacy provide state of … ☆62 · Updated 5 years ago
- Examples for aligning, padding and batching sequence labeling data (NER) for use with pre-trained transformer models ☆64 · Updated 3 years ago
- ☆18 · Updated 2 years ago
- Factorization of the neural parameter space for zero-shot multi-lingual and multi-task transfer ☆39 · Updated 5 years ago