AI4Bharat / IndicLID
Language Identification for Indian languages
☆16 Updated last year
Alternatives and similar repositories for IndicLID:
Users who are interested in IndicLID are comparing it to the libraries listed below:
- Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM" ☆59 Updated 6 months ago
- IndicGenBench is a high-quality, multilingual, multi-way parallel benchmark for evaluating Large Language Models (LLMs) on 4 user-facing … ☆46 Updated 7 months ago
- A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages ☆106 Updated 6 months ago
- Translation models for 22 scheduled languages of India ☆304 Updated last month
- Pretraining, fine-tuning and evaluation scripts for IndicBERT-v2 and IndicXTREME ☆94 Updated 2 weeks ago
- Transliteration models for 21 Indic languages ☆87 Updated last year
- A pipeline for transliteration, spell correction, POS tagging and word sense disambiguation of Hinglish code-mixed data to Hindi Devanaga… ☆36 Updated last year
- Hinglish Text Classification ☆30 Updated last year
- A simple, consistent and extendable toolkit for IndicTrans2 ☆25 Updated last month
- indicTranslate v1 - Machine Translation for 11 Indic languages. For latest v2, check: https://github.com/AI4Bharat/IndicTrans2 ☆125 Updated last year
- Transcribe your videos and translate them into Indic languages. ☆30 Updated this week
- This repository contains the code for dataset curation and finetuning of the instruct variant of the Bilingual OpenHathi model. The resultin… ☆23 Updated last year
- Describes the IndicNLP corpus and associated datasets ☆167 Updated 2 years ago
- Codebase for Indic-Transliteration using Seq2Seq RNN. For latest repo with Transformer-based models, check: https://github.com/AI4Bharat/… ☆60 Updated 3 years ago
- Repository for fine-tuning Gemma models using Unsloth for Indic languages ☆89 Updated last year
- aiXplain enables Python programmers to add AI functions to their software. ☆43 Updated this week
- Shoonya - Platform to annotate and label data at scale. ☆54 Updated 7 months ago
- Scripts to convert datasets from various sources to Hugging Face Datasets. ☆57 Updated 2 years ago
- MAFAND-MT ☆55 Updated 9 months ago
- A collaborative catalog of NLP resources for Indic languages ☆586 Updated 4 months ago
- Python package for Indic script transliteration ☆178 Updated 3 weeks ago
- Chitralekha - A video transcreation platform for Indic languages, supporting transcription, translation and voice-over ☆101 Updated 3 months ago
- SeeGULL is a broad-coverage stereotype dataset in English containing stereotypes about identity groups spanning 178 countries across 8 di… ☆33 Updated last year
- Using short models to classify long texts ☆21 Updated 2 years ago
- Smart commit messages ☆18 Updated 6 months ago
- Efficiently find the best-suited language model (LM) for your NLP task ☆120 Updated this week
- The IIT Bombay English-Hindi Parallel Corpus ☆19 Updated 3 years ago
- Fine-tuning Open-Source LLMs for Adaptive Machine Translation ☆77 Updated 2 weeks ago
- Code Repository for the IndicXNLI paper. ☆15 Updated last year
- Chunk your text using gpt4o-mini more accurately ☆44 Updated 8 months ago