AI4Bharat/IndicLLMSuite

AI4Bharat / IndicLLMSuite

A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages

☆413

Alternatives and similar repositories for IndicLLMSuite

Users interested in IndicLLMSuite are comparing it to the repositories listed below.

AI4Bharat / setu
Setu is a comprehensive pipeline designed to clean, filter, and deduplicate diverse data sources including Web, PDF, and Speech data. Bui…
☆16 | May 17, 2024 | Updated 2 years ago
VarunGumma / IndicTransToolkit
A simple, consistent, and extendable toolkit for IndicTrans2. (PyPI: https://pypi.org/project/indictranstoolkit)
☆39 | Apr 30, 2026 | Updated 2 months ago
VishnuPJ / MalayaLLM
A Continually LoRA PreTrained and FineTuned 7B Llama-2 Indic model for Malayalam Language.
☆70 | Jul 16, 2024 | Updated 2 years ago
rasbt / datapipes-blog
Code for the DataPipes article
☆15 | Jun 14, 2022 | Updated 4 years ago
TeluguLLMLabs / Indic-gemma-7b-Navarasa
Repository for fine-tuning Gemma models using Unsloth for Indic languages
☆100 | Mar 18, 2024 | Updated 2 years ago
SylphAI-Inc / LLM-engineer-handbook
A curated list of Large Language Model resources, covering model training, serving, fine-tuning, and building LLM applications.
☆4,985 | Aug 18, 2025 | Updated 11 months ago
AI4Bharat / IndicInstruct
Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM"
☆65 | Oct 26, 2024 | Updated last year
RaoAviralYadav / From-Code-to-Creation
Unlock the exciting world of frontend web development with our hands-on workshop, From Code to Creation. Designed for beginners, this com…
☆14 | Mar 21, 2025 | Updated last year
nikhilhuh / Mail-Sending-Demo
☆23 | Mar 12, 2026 | Updated 4 months ago
soumendrak / MTEnglish2Odia
Machine translation from English to Odia.
☆10 | Aug 9, 2021 | Updated 4 years ago
kirudang / Automated_Text_Extraction
☆11 | Oct 9, 2023 | Updated 2 years ago
AI4Bharat / indicnlp_catalog
A collaborative catalog of NLP resources for Indic languages
☆638 | Dec 14, 2024 | Updated last year
vliu15 / adversarial-tts
End-to-end Text-to-Speech with Generative Adversarial Networks
☆20 | Feb 6, 2021 | Updated 5 years ago
OpenNyAI / Jugalbandi-Manager
Jugalbandi (JB) Manager is a full AI-powered conversational chatbot platform. It's platform agnostic and can serve multiple channels such…
☆40 | Apr 14, 2025 | Updated last year
ResorcinolWorks / DeepLearningLab
This repo contains the document of index and combined doc of solutions
☆18 | May 25, 2025 | Updated last year
mrcyb4r / OSINT-TOOLs
☆21 | Jun 13, 2026 | Updated last month
cybertronai / Megatron-LM
Ongoing research training transformer language models at scale, including: BERT
☆16 | Apr 25, 2019 | Updated 7 years ago
kaburelabs / Wine-Project-Dash
My second web development project using Dash.
☆10 | Jun 20, 2023 | Updated 3 years ago
Open-Speech-EkStep / indic-punct
☆45 | Dec 15, 2022 | Updated 3 years ago
sarvamai / llm_wer
☆25 | Apr 2, 2026 | Updated 3 months ago
precog-iiith / LLMWorkshop
☆30 | Apr 20, 2024 | Updated 2 years ago
project-anuvaad / anuvaad-parallel-corpus
☆24 | May 5, 2022 | Updated 4 years ago
smc / payyans-go
ASCII <-> Unicode conversion library
☆18 | Apr 1, 2024 | Updated 2 years ago
adithya-s-k / indic_eval
A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks
☆40 | Jun 10, 2024 | Updated 2 years ago
toheedakhtar / llm-scratch
Building a Large Language Model (LLM) from scratch.
☆36 | Feb 4, 2025 | Updated last year
Abonia1 / yolosegment2labelme
yolosegment2labelme - a Python package that allows you to convert YOLO segmentation prediction results to LabelMe and anylabeling JSON fo…
☆10 | May 8, 2024 | Updated 2 years ago
kurianbenoy / Indic-Subtitler
Open source subtitling platform 💻 for transcribing and translating videos/audios in Indic languages.
☆93 | Oct 3, 2025 | Updated 9 months ago
winstxnhdw / CapGen
A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.
☆11 | Updated this week
varnamproject / webIME
A JavaScript Input Method Engine inspired by ibus on GNU/Linux
☆17 | May 13, 2023 | Updated 3 years ago
vickysingh009 / jarvis-ai-assistant
☆66 | Jun 17, 2025 | Updated last year
Kenpath / indic-text-normalization
Text Normalization utilities for normalizing text for TTS
☆26 | Mar 4, 2026 | Updated 4 months ago
ankitpathak62 / Jarvis-2025
To control all functions of my system and do all my work after listening to my command
☆140 | Feb 11, 2025 | Updated last year
hiten-codes / Java-Developer-Roadmap
Best and Free resources to master Java and become a Java Developer in 2025
☆121 | Dec 21, 2024 | Updated last year
smtiitm / Fastspeech2_HS
Indic TTS for Indian Languages: This is a project on developing text-to-speech (TTS) synthesis systems for Indian languages, improving qu…
☆57 | Feb 5, 2026 | Updated 5 months ago
OpenCSGs / Awesome-SLMs
Survey of small language models
☆18 | Jul 23, 2024 | Updated 2 years ago
ARTPARK-SAHAI-ORG / calibrate
Core engine behind Calibrate, a framework for evaluating AI agents: speech-to-text, text-to-speech, LLM evaluation, end-to-end simulation…
☆17 | Updated this week
viktorbezdek / awesome-github-projects
Curated list of GitHub projects I starred over the years
☆831 | Updated this week
weaviate-tutorials / Hurricane
Writing Blog Posts with Generative Feedback Loops!
☆52 | Mar 19, 2024 | Updated 2 years ago
rasbt / litdata
Streamline data pipelines for AI. Process datasets across 1000s of machines, and optimize data for blazing fast model training.
☆16 | Sep 18, 2024 | Updated last year