sagorbrur / bangla-corpus
A curated list of Bangla NLP corpora
☆14 Updated last year
Alternatives and similar repositories for bangla-corpus
Users who are interested in bangla-corpus are comparing it to the repositories listed below.
- State of the Art Language models and Classifier for Bengali, which is primarily spoken by the Bengalis in South Asia. ☆32 Updated 4 years ago
- This python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batc… ☆35 Updated last year
- Bangla-Bert is a pretrained BERT model for the Bengali language ☆79 Updated 2 months ago
- Bengali transformer using transformers ☆21 Updated 2 months ago
- Bangla news classification and generation ☆22 Updated 4 years ago
- Describes the IndicNLP corpus and associated datasets ☆173 Updated 2 years ago
- Automatic Context Sensitive Spelling Correction for Bangla Text Using Bert and Levenshtein Distance ☆21 Updated 7 months ago
- State of the Art Language models and Classifier for Hindi language (spoken in Indian sub-continent) ☆123 Updated 4 years ago
- Resources and Tool for Bangla language computation ☆14 Updated 2 years ago
- indicTranslate v1 - Machine Translation for 11 Indic languages. For latest v2, check: https://github.com/AI4Bharat/IndicTrans2 ☆127 Updated last year
- The Dakshina dataset is a collection of text in both Latin and native scripts for 12 South Asian languages. For each language, the datase… ☆195 Updated 5 years ago
- Text Summarization for Research Papers ☆75 Updated 2 years ago
- Line and Word Segmentation for Bangla Handwritten Text Recognition ☆15 Updated last year
- A collection of Bangla newspaper and blog crawlers. Can be used to mine Bangla text data for Natural Language Processing tasks. ☆17 Updated 2 years ago
- Collection of Urdu datasets for POS, NER, Sentiment, Summarization and NLP tasks. ☆72 Updated 10 months ago
- Translation models for 22 scheduled languages of India ☆327 Updated last month
- Data Inspector is an open-source python library that brings 15++ types of different functions to make EDA, data cleaning easier. ☆39 Updated 2 months ago
- ☆48 Updated 5 years ago
- Awesome Bangla Datasets ☆19 Updated 3 months ago
- Get vaccine availability in India ☆25 Updated 4 years ago
- Transliteration models for 21 Indic languages ☆92 Updated last year
- An example of multilingual machine translation using a pretrained version of mt5 from Hugging Face. ☆42 Updated 4 years ago
- A collection of paper implementations using the PyTorch framework ☆28 Updated 4 years ago
- Into the depths of some concepts of Artificial Intelligence and Machine Learning ☆10 Updated 2 weeks ago
- Hindi NLP work ☆14 Updated 3 years ago
- A Continually LoRA PreTrained and FineTuned 7B Llama-2 Indic model for Malayalam Language. ☆61 Updated 11 months ago
- Bangla Emotion Detection ☆10 Updated 4 years ago
- Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian-English. For latest Indic-BERT v2, check: https://github.c… ☆286 Updated 2 years ago
- Tensorflow Certification Exam: Practice ☆50 Updated 4 years ago
- Pretraining, fine-tuning and evaluation scripts for IndicBERT-v2 and IndicXTREME ☆97 Updated 2 months ago