A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages
☆393Oct 7, 2024Updated last year
Alternatives and similar repositories for IndicLLMSuite
Users who are interested in IndicLLMSuite are comparing it to the libraries listed below.
- Setu is a comprehensive pipeline designed to clean, filter, and deduplicate diverse data sources including Web, PDF, and Speech data. Bui…☆16May 17, 2024Updated last year
- A Continually LoRA PreTrained and FineTuned 7B Llama-2 Indic model for Malayalam Language.☆66Jul 16, 2024Updated last year
- A simple, consistent and extendable toolkit for IndicTrans2. (Pypi: https://pypi.org/project/indictranstoolkit)☆38Jul 24, 2025Updated 8 months ago
- speak like locals when travelling☆18Dec 28, 2025Updated 2 months ago
- Translation models for 22 scheduled languages of India☆414Oct 3, 2025Updated 5 months ago
- FBI: Finding Blindspots in LLM Evaluations with Interpretable Checklists☆31Aug 14, 2025Updated 7 months ago
- Repository for fine-tuning gemma models using unsloth for indic languages☆97Mar 18, 2024Updated 2 years ago
- Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM"☆64Oct 26, 2024Updated last year
- Generate large textual corpora for almost any language by crawling the web☆13Feb 17, 2024Updated 2 years ago
- End-to-end Text-to-Speech with Generative Adversarial Networks☆20Feb 6, 2021Updated 5 years ago
- A collaborative catalog of NLP resources for Indic languages☆629Dec 14, 2024Updated last year
- ☆11Oct 9, 2023Updated 2 years ago
- Ongoing research training transformer language models at scale, including: BERT☆16Apr 25, 2019Updated 6 years ago
- Keyphrase Extraction from Scholarly Documents - Thesis☆14Nov 3, 2021Updated 4 years ago
- Shoonya - Platform to Annotate and label data at scale.☆65Oct 31, 2025Updated 4 months ago
- This repository contains the code for dataset curation and finetuning of instruct variant of the Bilingual OpenHathi model. The resultin…☆23Dec 23, 2023Updated 2 years ago
- A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks☆39Jun 10, 2024Updated last year
- ☆45Dec 15, 2022Updated 3 years ago
- Open source subtitling platform 💻 for transcribing and translating videos/audios in Indic languages.☆93Oct 3, 2025Updated 5 months ago
- ☆29Apr 20, 2024Updated last year
- A New Tamil Large Language Model (LLM) Based on Llama 2☆325Apr 5, 2024Updated last year
- A fast CPU-first video/audio transcriber for generating caption files with Whisper and CTranslate2, hosted on Hugging Face Spaces.☆11Updated this week
- A JavaScript Input Method Engine inspired by ibus on GNU/Linux☆17May 13, 2023Updated 2 years ago
- Bangla PDF to text converter that works on Windows, macOS, and Linux without any extra downloads or configurations.☆22Oct 12, 2024Updated last year
- Source code and dataset for the paper 'Saamayik: A Benchmark and Dataset for English-Sanskrit Translation'☆15Oct 11, 2025Updated 5 months ago
- ☆23May 5, 2022Updated 3 years ago
- Survey of small language models☆18Jul 23, 2024Updated last year
- generate video with voice narration from ppt/pdf Slides☆10Sep 4, 2023Updated 2 years ago
- Build Agentic workflows with function calling using open LLMs☆28Mar 2, 2026Updated 3 weeks ago
- Django Legal Advice Builder is a Django app that can be used to create, edit and display multi-step questionnaires and display the answers…☆12Dec 23, 2022Updated 3 years ago
- Writing Blog Posts with Generative Feedback Loops!☆50Mar 19, 2024Updated 2 years ago
- A library of translation-based text similarity measures☆25Dec 11, 2023Updated 2 years ago
- An LLM enabled XML generator for Indian laws in the LegalDocML and LegalRuleML formats☆20Sep 6, 2024Updated last year
- Exploration of Vector database Index for fast approximate nearest neighbour search.☆37Aug 4, 2024Updated last year
- This repository describes the Indic NLP resources from L3Cube.☆23Jun 7, 2025Updated 9 months ago
- Collection of numerical methods for high frequency data, in Python notebooks☆13Mar 10, 2021Updated 5 years ago
- ☆11May 24, 2015Updated 10 years ago
- Calibration and Simulation Engine for Local Volatility Models☆15Dec 13, 2021Updated 4 years ago
- Light WebUI for lm.rs☆24Oct 14, 2024Updated last year