Pretraining, fine-tuning and evaluation scripts for IndicBERT-v2 and IndicXTREME
☆109Apr 6, 2025Updated 11 months ago
Alternatives and similar repositories for IndicBERT
Users who are interested in IndicBERT are comparing it to the libraries listed below
- A collaborative catalog of NLP resources for Indic languages☆627Dec 14, 2024Updated last year
- Translation models for 22 scheduled languages of India☆407Oct 3, 2025Updated 5 months ago
- Pretraining, fine-tuning and evaluation scripts for Indic-Wav2Vec2☆110Aug 28, 2025Updated 6 months ago
- This repository contains the HiNER dataset released with our paper at LREC 2022☆16Jun 6, 2023Updated 2 years ago
- We want to build open-source solutions and standards for using AI to solve mental health challenges. The goal is to apply DPI knowledge a…☆27Jun 13, 2025Updated 8 months ago
- ☆12Mar 31, 2020Updated 5 years ago
- Python version of the SymSpell Compound algorithm☆12Sep 18, 2018Updated 7 years ago
- 🗺️ OpenStreetMap Countries GeoJSON — updated daily!☆17Aug 17, 2025Updated 6 months ago
- A modular framework for neural networks with Euclidean symmetry☆11Jul 1, 2024Updated last year
- make logging fun again☆19Apr 9, 2017Updated 8 years ago
- Code for extracting parallel corpora from pmindia☆17Jan 28, 2020Updated 6 years ago
- ☆18Feb 19, 2026Updated 2 weeks ago
- A collection of Bangla newspaper and blog crawlers. Can be used to mine Bangla text data for Natural Language Processing tasks.☆18Jan 30, 2023Updated 3 years ago
- An open-source framework designed to adapt pre-trained Language Models (LLMs), such as Llama, Mistral, and Mixtral, to a wide array of dom…☆23May 27, 2024Updated last year
- We have used 250 sentences of movie reviews available for research from IIT Bombay and also crawled and manually annotated 750 reviews fr…☆20Jun 28, 2018Updated 7 years ago
- A rule-based lemmatizer for Bengali / Bangla written in Python. Under active development.☆25Dec 28, 2019Updated 6 years ago
- Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM"☆64Oct 26, 2024Updated last year
- A repository to perform self-instruct with a model on HF Hub☆32Sep 29, 2023Updated 2 years ago
- This will hold the data pipeline to convert raw audio data to speech which will act as input dataset for speech-to-text pipeline☆32Feb 15, 2023Updated 3 years ago
- Yet Another Neural Machine Translation Toolkit☆179Mar 7, 2025Updated last year
- GPI-Space: Memory Driven Computing and Big Data☆10Jan 2, 2025Updated last year
- Write tweets with AI Agents (CrewAI) and LLMs (Llama 3, GPT-4o)☆30Jun 1, 2024Updated last year
- Resources and tools for Indian language Natural Language Processing☆629Jun 7, 2024Updated last year
- ☆31Aug 9, 2022Updated 3 years ago
- indicTranslate v1 - Machine Translation for 11 Indic languages. For latest v2, check: https://github.com/AI4Bharat/IndicTrans2☆136Jan 2, 2024Updated 2 years ago
- The Dakshina dataset is a collection of text in both Latin and native scripts for 12 South Asian languages. For each language, the datase…☆205May 27, 2020Updated 5 years ago
- Resources to go with the Indic NLP Library☆78Jun 12, 2022Updated 3 years ago
- pix2pix and Cycle GAN architectures for image style transfer☆13May 27, 2021Updated 4 years ago
- ☆10May 28, 2025Updated 9 months ago
- Continual Resilient (CoRe) Optimizer for PyTorch☆11Jun 10, 2024Updated last year
- Demo repository showcasing how to use reusable workflows to build artifact attestations☆14Feb 16, 2026Updated 3 weeks ago
- Reinforcement learning modular with pytorch☆11Jan 18, 2021Updated 5 years ago
- Repository to store Sanskrit koshas and scripts to process them.☆37Aug 22, 2025Updated 6 months ago
- ☆16Jan 16, 2023Updated 3 years ago
- Self-evaluating RAG application on LangCheck docs☆11Sep 10, 2025Updated 5 months ago
- Official implementation of the paper "Deep Learning for Hate Speech Detection -A Comparative Study"☆39Dec 10, 2021Updated 4 years ago
- IPython notebooks for feature extraction and training of audio event classifier on ESC-50 dataset.☆10Mar 16, 2018Updated 7 years ago
- A selection of test cases used to test accessibility and Section 508 compliance of mobile applications☆12Apr 1, 2015Updated 10 years ago
- Cross-lingual Fact-to-Text Alignment and Generation for Low-Resource Languages☆11Jan 1, 2023Updated 3 years ago