microsoft/LID-tool



This code provides a word-level language identification tool that identifies the language of each word in code-mixed text, i.e., text that mixes words from two languages, such as Hindi written in Roman script mixed with English.

☆60

Alternatives and similar repositories for LID-tool

Users that are interested in LID-tool are comparing it to the libraries listed below.


aparnadutta / code-mixed-lid
Word-level language identification for Bangla-English code-mixed social media data, using a BiLSTM with subword embeddings.
☆10 · Aug 13, 2023 · Updated 2 years ago
fyvo / WMT-Biomed-Test
☆13 · Aug 23, 2024 · Updated last year
irshadbhat / litcm
Language Identification and transliteration tool for Indian language code mixed data.
☆24 · Feb 29, 2016 · Updated 10 years ago
microsoft / CodeMixed-Text-Generator
This tool helps automatic generation of grammatically valid synthetic Code-mixed data by utilizing linguistic theories such as Equivalenc…
☆62 · Jul 30, 2024 · Updated last year
sumanbanerjee1 / Code-Mixed-Dialog
☆33 · Jun 20, 2018 · Updated 8 years ago
murali1996 / CodemixedNLP
CodemixedNLP: An Extensible and Open NLP Toolkit for Code-Switching
☆18 · Mar 29, 2021 · Updated 5 years ago
frozentoad9 / CMST
Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages
☆13 · Oct 12, 2022 · Updated 3 years ago
sahilswami96 / SarcasmDetection_CodeMixed
☆10 · Oct 2, 2017 · Updated 8 years ago
microsoft / GLUECoS
A benchmark for code-switched NLP, ACL 2020
☆76 · May 28, 2024 · Updated 2 years ago
uthree / ddsp-vocoder
☆12 · Nov 7, 2024 · Updated last year
steve-wilson / nlpcss201-sm-preprocessing
Materials from the NLPCSS 201 Social Media Preprocessing Tutorial, March 16, 2022
☆13 · Nov 10, 2022 · Updated 3 years ago
shengcanxu / canoSpeech
text to speech
☆10 · Mar 19, 2024 · Updated 2 years ago
danielpreotiuc / complaints-social-media
Research on Complaints in Social Media (ACL 2019)
☆15 · Aug 15, 2019 · Updated 6 years ago
sarulab-speech / Coco-Nut
Coco-Nut (Corpus of connecting NIHONGO utterance and text) corpus
☆21 · Jun 12, 2024 · Updated 2 years ago
tmramalho / finetune-mbart
How to finetune mbart using fairseq
☆25 · Dec 17, 2020 · Updated 5 years ago
NirantK / Hinglish
Hinglish Text Classification
☆30 · Jun 12, 2023 · Updated 3 years ago
SilentFlame / Named-Entity-Recognition
Corpus and a baseline neural network system for Named Entity Recognition in Hindi-English Code-Mixed social media text.
☆46 · Sep 25, 2020 · Updated 5 years ago
rnd2110 / MorphAGram
A Language-Independent Unsupervised Morphological Segmentation Framework based on Adaptor Grammars
☆17 · Jun 14, 2024 · Updated 2 years ago
gentaiscool / code-switching-papers
A curated list of research papers and resources on code-switching
☆344 · Jan 31, 2026 · Updated 5 months ago
wabyking / word2fun
☆11 · May 9, 2022 · Updated 4 years ago
mushanshanshan / ESLTTS
ESLTTS dataset
☆16 · Feb 6, 2025 · Updated last year
amazon-science / iwslt-autodub-task
☆21 · Mar 4, 2024 · Updated 2 years ago
kimbrianj / mlforsocialscience
R package containing material for SURV 613: Machine Learning for Social Sciences
☆11 · Mar 24, 2026 · Updated 3 months ago
tatianapassali / artificial-disfluency-generation
Generating artificial disfluencies from fluent text easily and promptly
☆16 · Sep 28, 2022 · Updated 3 years ago
mrinaldhar / en-hi-codemixed-corpus
Repository for the English-Hindi Codemixed to Monolingual English Parallel Corpus
☆13 · Feb 17, 2019 · Updated 7 years ago
NJUNLP / INK
☆14 · May 26, 2023 · Updated 3 years ago
duyichao / NPDA-KNN-ST
Official implementation of EMNLP'2022 paper "Non-Parametric Domain Adaptation for End-to-End Speech Translation"
☆11 · Oct 26, 2022 · Updated 3 years ago
DDATT / Vits2-onnx-cpp
Simple inference for Vits2 TTS Using ONNXRUNTIME and espeak-ng on C++
☆19 · Apr 17, 2024 · Updated 2 years ago
osmanuygar / turkish-text-classification-api
☆10 · Jan 19, 2023 · Updated 3 years ago
hate-alert / HateMM
☆17 · Jun 17, 2024 · Updated 2 years ago
praatibhsurana / Hinglish_Hindi_WSD
A pipeline for transliteration, spell correction, POS tagging and word sense disambiguation of Hinglish code mixed data to Hindi Devanaga…
☆37 · Jan 14, 2024 · Updated 2 years ago
xuchennlp / S2T
The project for speech translation
☆12 · Sep 28, 2023 · Updated 2 years ago
line / WaveTrainerFit
Official implementation of "Wave-Trainer-Fit: Neural Vocoder with Trainable Prior and Fixed-Point Iteration towards High-Quality Speech G…
☆16 · Feb 6, 2026 · Updated 5 months ago
liuhuang31 / g2pw_once
G2pw's inference speed is accelerated by about 8-10 times. Change loop generated predictive data to only once and model loop prediction b…
☆14 · Dec 30, 2023 · Updated 2 years ago
salesforce / adversarial-polyglots
Code for the paper "Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots" (NAACL-HLT 2021)
☆10 · May 1, 2025 · Updated last year
irshadbhat / csnli
Language identification and normalisation in code switching data tailored with a three-step decoding process
☆24 · Dec 23, 2019 · Updated 6 years ago
dmse4tts / DMSE4TTS
☆24 · May 6, 2025 · Updated last year
kaistmm / AdaptVC
☆17 · Jun 2, 2025 · Updated last year
juliamendelsohn / framing
☆21 · Feb 9, 2022 · Updated 4 years ago