joshua-decoder/indian-parallel-corpora

☆70

Alternatives and similar repositories for indian-parallel-corpora

Users that are interested in indian-parallel-corpora are comparing it to the libraries listed below.

nlpcuom / English-Tamil-Parallel-Corpus
☆14 · Jan 4, 2021 · Updated 5 years ago
anikethjr / NER_Telugu
An LSTM-CRF classifier for NER in Telugu, an Indian language.
☆15 · Sep 4, 2022 · Updated 3 years ago
bidishasamantakgp / VACS
Code and data for "A Deep Generative Model for Code-Switched Text" accepted in IJCAI 2019
☆16 · Nov 14, 2019 · Updated 6 years ago
divyanshuaggarwal / IndicXNLI
Code Repository for the IndicXNLI paper.
☆15 · Jul 8, 2023 · Updated 3 years ago
AnushaMotamarri / Telugu-Books-Dataset
This project scrapes text from Telugu books (novels).
☆10 · Aug 3, 2021 · Updated 4 years ago
puneetsl / Romadeva
It is a simple tool to convert Roman script to Indic (Devanagari) script. As most Keyboards are English and to write in Indic script is di…
☆13 · Aug 31, 2016 · Updated 9 years ago
irshadbhat / litcm
Language Identification and transliteration tool for Indian language code mixed data.
☆24 · Feb 29, 2016 · Updated 10 years ago
ecscstatsconsulting / morphemes
A practical Python library for identifying morphemes.
☆13 · Mar 11, 2023 · Updated 3 years ago
ltrc / indic-wx-converter
Python library for converting UTF to WX and vice-versa for Indian languages.
☆11 · Jan 17, 2019 · Updated 7 years ago
Bollegala / DARep
Cross-domain word representation learning
☆10 · May 23, 2015 · Updated 11 years ago
violet-zct / DeMa-BWE
NAACL 2019 paper: Density Matching for Bilingual Word Embedding (Zhou et al., 2019)
☆63 · Dec 8, 2022 · Updated 3 years ago
jacklxc / StandAloneSpellingCorrection
Repository for Findings of EMNLP 2020 "Context-aware Stand-alone Neural Spelling Correction"
☆18 · Dec 21, 2020 · Updated 5 years ago
AI4Bharat / indicnlp_catalog
A collaborative catalog of NLP resources for Indic languages
☆638 · Dec 14, 2024 · Updated last year
tnq177 / witwicky
Witwicky: An implementation of Transformer in PyTorch.
☆22 · Aug 17, 2020 · Updated 5 years ago
shahparth123 / eng_guj_parallel_corpus
This repository contains a dataset for English-to-Gujarati translation.
☆10 · Dec 27, 2020 · Updated 5 years ago
Abhishekmamidi123 / Natural-Language-Processing
Language Modelling, CMI vs Perplexity
☆11 · Mar 17, 2018 · Updated 8 years ago
facebookresearch / evaluation-of-nmt-bt
This repository contains additional reference translations for the WMT'14 En-De (newstest2014) and WMT'19 En-Ru (newstest2019) test sets …
☆15 · Aug 31, 2021 · Updated 4 years ago
sanskrit-coders / chandas
Sanskrit metre: miscellaneous code and data.
☆22 · Mar 27, 2026 · Updated 3 months ago
szilard / ML-scoring
Compare the scoring speed of several open source machine learning libraries.
☆19 · Jun 19, 2017 · Updated 9 years ago
mrinaldhar / en-hi-codemixed-corpus
Repository for the English-Hindi Codemixed to Monolingual English Parallel Corpus
☆13 · Feb 17, 2019 · Updated 7 years ago
neulab / covid19-datashare
A repo for sharing language resources related to the outbreak (in machine readable format)
☆25 · Sep 22, 2025 · Updated 10 months ago
java10000 / semantic_similarity_based_on_ANN
Research on Chinese semantic similarity computation based on artificial neural networks
☆11 · Apr 1, 2013 · Updated 13 years ago
helmertz / querysum
☆14 · Jun 9, 2017 · Updated 9 years ago
vigneshwaran-chandrasekaran / tamil-language-words-list
Tamil Language words list
☆12 · Jul 2, 2016 · Updated 10 years ago
cisnlp / parcoure
ParCourE - Parallel Corpus Explorer
☆12 · Dec 27, 2021 · Updated 4 years ago
notAI-tech / DeepTranslit
Efficient and easy to use transliteration for Indian languages
☆49 · Aug 7, 2020 · Updated 5 years ago
midas-research / hindi-nli-data
A repository containing the details of a natural language inference dataset in Hindi
☆14 · Dec 28, 2020 · Updated 5 years ago
libindic / Transliteration
Transliteration module for Indian Languages
☆79 · Oct 24, 2025 · Updated 9 months ago
skit-ai / speech-recognition
SDKs and docs for Skit's speech to text service
☆21 · Jul 5, 2023 · Updated 3 years ago
Unbabel / MT-Telescope
☆33 · Nov 22, 2021 · Updated 4 years ago
anoopkunchukuttan / indic_nlp_library
Resources and tools for Indian language Natural Language Processing
☆639 · Jun 7, 2024 · Updated 2 years ago
libindic / indic-trans
The project aims at adding a state-of-the-art transliteration module for cross transliterations among all Indian languages including Engl…
☆275 · Oct 28, 2022 · Updated 3 years ago
AI4Bharat / Indic-BERT-v1
Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian-English. For latest Indic-BERT v2, check: https://github.c…
☆297 · May 11, 2023 · Updated 3 years ago
univai-ghf / PythonWorkshop
The GHF Python Workshop
☆11 · Nov 10, 2022 · Updated 3 years ago
rbawden / DiaBLa-dataset
English-French MT dialogue dataset
☆17 · Apr 29, 2022 · Updated 4 years ago
Helsinki-NLP / MuCoW
Automatically harvested multilingual contrastive word sense disambiguation test sets for machine translation
☆18 · Jan 18, 2021 · Updated 5 years ago
shibei00 / Cross-Lingual-Topic-Model
A topic model which can identify bilingual topics across an unaligned corpus using a dictionary. An implementation of the paper "Detecting Com…
☆14 · Oct 25, 2017 · Updated 8 years ago
notAI-tech / IndicASR
Speech Recognition for Indic languages.
☆13 · Apr 3, 2021 · Updated 5 years ago
goru001 / inltk
Natural Language Toolkit for Indic Languages aims to provide out of the box support for various NLP tasks that an application developer m…
☆838 · Jan 20, 2024 · Updated 2 years ago