mrinaldhar/en-hi-codemixed-corpus

Repository for the English-Hindi Codemixed to Monolingual English Parallel Corpus

☆13

Alternatives and similar repositories for en-hi-codemixed-corpus

Users that are interested in en-hi-codemixed-corpus are comparing it to the libraries listed below.

OSDG-IIITH / TechX
This repository contains all resources corresponding to the various TechX sessions at IIIT Hyderabad
☆19 · Dec 12, 2018 · Updated 7 years ago
irshadbhat / csnli
Language identification and normalisation in code switching data tailored with a three-step decoding process
☆24 · Dec 23, 2019 · Updated 6 years ago
chsasank / indic-transliteration
Hindi-English Transliteration Using sequence to sequence learning
☆17 · Apr 3, 2017 · Updated 9 years ago
anoopkunchukuttan / crowd-indic-transliteration-data
Xlit-Crowd: Hindi-English Transliteration Corpus
☆38 · Feb 17, 2015 · Updated 11 years ago
AI4Bharat / indic-bart
Pre-trained, multilingual sequence-to-sequence models for Indian languages
☆51 · Jul 20, 2022 · Updated 4 years ago
irshadbhat / litcm
Language Identification and transliteration tool for Indian language code mixed data.
☆24 · Feb 29, 2016 · Updated 10 years ago
piyushmakhija5 / hinglishNorm
A Hindi-English Dataset for Text Normalization
☆18 · Jan 3, 2022 · Updated 4 years ago
anoopkunchukuttan / geomm
Geometry-aware Multilingual Embeddings
☆26 · Dec 8, 2022 · Updated 3 years ago
microsoft / CodeMixed-Text-Generator
This tool helps automatic generation of grammatically valid synthetic Code-mixed data by utilizing linguistic theories such as Equivalenc…
☆62 · Jul 30, 2024 · Updated last year
SAP / software-documentation-data-set-for-machine-translation
A parallel evaluation data set of SAP software documentation with document structure annotation
☆15 · Jun 12, 2026 · Updated last month
microsoft / GLUECoS
A benchmark for code-switched NLP, ACL 2020
☆76 · May 28, 2024 · Updated 2 years ago
ajinkyakulkarni14 / How-I-Extracted-TED-talks-for-parallel-Corpus-
☆34 · Nov 29, 2016 · Updated 9 years ago
precog-iiith / hindi-english-code-mixing-lidf-ner
☆10 · Aug 1, 2018 · Updated 7 years ago
MysteryVaibhav / robust_mtnt
Code for the paper "Improving Robustness of Machine Translation with Synthetic Noise"
☆21 · Dec 23, 2019 · Updated 6 years ago
jerinphilip / ilmulti
Tooling to play around with multilingual machine translation for Indian Languages.
☆22 · Mar 5, 2022 · Updated 4 years ago
JasonForJoy / FIRE
EMNLP 2020: Filtering before Iteratively Referring for Knowledge-Grounded Response Selection in Retrieval-Based Chatbots
☆12 · Dec 15, 2020 · Updated 5 years ago
precog-iiith / hindi-english-code-mixed-POS-tagging
POS tagging models for Hindi English Code Mixed Tweets
☆11 · Aug 1, 2018 · Updated 7 years ago
shuyanzhou / multitask_transformer
Source code for "Improving Robustness of Neural Machine Translation with Multi-task Learning"
☆19 · Aug 15, 2019 · Updated 6 years ago
y3ro / meemi
Improving cross-lingual word embeddings by meeting in the middle
☆23 · Aug 25, 2020 · Updated 5 years ago
ayushidalmia / Phrase-Based-Model
Implementation of Phrase Based Model to translate sentences from English to German and vice versa
☆12 · May 23, 2014 · Updated 12 years ago
spyysalo / wiki-bert-pipeline
Generate BERT vocabularies and pretraining examples from Wikipedias
☆17 · May 11, 2020 · Updated 6 years ago
ltrc / indic-wx-converter
Python library for converting UTF to WX and vice-versa for Indian languages.
☆11 · Jan 17, 2019 · Updated 7 years ago
marshallwhiteorg / emnlp19-media-bias
Code and data for the EMNLP 2019 paper "In Plain Sight: Media Bias Through the Lens of Factual Reporting"
☆10 · Feb 15, 2022 · Updated 4 years ago
graviraja / nlp-paper-summary
☆16 · Oct 12, 2020 · Updated 5 years ago
mallika2011 / Archive-Abode
Archive of my notes taken at lectures in IIITH
☆25 · Aug 15, 2021 · Updated 4 years ago
zorazrw / multilingual-conala
[EACL'23] MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages
☆23 · Feb 13, 2023 · Updated 3 years ago
aizhanti / JaRuNC
Japanese-Russian-English News Commentary Parallel Data
☆18 · Jul 9, 2019 · Updated 7 years ago
bert-nmt / ctx-bert-nmt
Extend bert-nmt to context-aware translation.
☆11 · May 24, 2021 · Updated 5 years ago
stared / which-ml-are-you
Which ML are you?
☆13 · Jan 3, 2023 · Updated 3 years ago
yunsukim86 / sockeye-transfer
Transfer learning for neural machine translation using cross-lingual word embeddings
☆10 · Dec 17, 2025 · Updated 7 months ago
mlberkeley / deepart-workshop
Making Art with Deep Learning Workshop | ML@B
☆26 · Feb 22, 2018 · Updated 8 years ago
Kartikaggarwal98 / Indian_ParallelCorpus
Curated list of publicly available parallel corpus for Indian Languages
☆36 · Jul 15, 2021 · Updated 5 years ago
hclent / CreoleVal
The central repo for Creole based NLU and NLG work
☆18 · May 2, 2025 · Updated last year
sayarghoshroy / Intro_to_DL_tutorial
Material for introduction to Deep Learning Tutorials, Summer '20, '21
☆16 · Jul 2, 2021 · Updated 5 years ago
epfl-exts / rl-workshop
Reinforcement Learning workshop @ amld20
☆25 · Oct 3, 2023 · Updated 2 years ago
NirantK / Hinglish
Hinglish Text Classification
☆30 · Jun 12, 2023 · Updated 3 years ago
Joon-Park92 / Zero-Shot-Translation-Transformer
Zero-Shot Translation implemented by Transformer
☆14 · Mar 24, 2023 · Updated 3 years ago
praatibhsurana / Hinglish_Hindi_WSD
A pipeline for transliteration, spell correction, POS tagging and word sense disambiguation of Hinglish code mixed data to Hindi Devanaga…
☆37 · Jan 14, 2024 · Updated 2 years ago
shuoyangd / tape4nmt
a ducttape workflow for neural machine translation
☆14 · Mar 23, 2021 · Updated 5 years ago