BengaliAI / BADLAD
BADLAD: Bengali Document Layout Analysis Dataset
☆15Updated last year
Alternatives and similar repositories for BADLAD
Users who are interested in BADLAD are comparing it to the libraries listed below.
- Bangla Unicode Normalization☆21Updated last year
- This python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batc…☆36Updated last year
- [Computer Speech & Language] A transformer-based spelling error correction framework for Bangla and resource scarce Indic languages☆14Updated last year
- A framework for Arabic spelling correction using different seq2seq model architectures such as transformers and RNNs☆23Updated last year
- Handwritten text recognition using transformers.☆158Updated last year
- A Java toolkit to generate multi fonts Arabic text images☆11Updated 4 years ago
- [ICDAR 2023] (Oral) An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation☆76Updated last year
- indicTranslate v1 - Machine Translation for 11 Indic languages. For latest v2, check: https://github.com/AI4Bharat/IndicTrans2☆135Updated 2 years ago
- Code, models, and data for "Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation". EMNLP 2023.☆17Updated last year
- This repository contains the official release of the model "BanglaT5" and associated downstream finetuning code and datasets introduced i…☆85Updated 2 years ago
- ☆63Updated 6 months ago
- PyTorch implementation of our paper: Adapting OCR with Limited Labels☆62Updated 2 years ago
- ☆141Updated last year
- ☆10Updated 7 months ago
- Code-Switched translations with Large Language models☆24Updated last year
- ☆50Updated 3 years ago
- Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset☆50Updated 2 years ago
- Fine-tuning Open-Source LLMs for Adaptive Machine Translation☆90Updated 6 months ago
- Transformer based Bangla Speech Recognition | Encoder Decoder Architecture☆57Updated 2 years ago
- ☆18Updated last year
- Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task…☆287Updated 2 years ago
- Segmenting text blocks and baselines from documents using deep learning techniques☆13Updated 4 years ago
- Easter2.0: IMPROVING CONVOLUTIONAL MODELS FOR HANDWRITTEN TEXT RECOGNITION☆79Updated 2 years ago
- IndicGenBench is a high-quality, multilingual, multi-way parallel benchmark for evaluating Large Language Models (LLMs) on 4 user-facing …☆57Updated last year
- Open source speech to text models for Indic Languages☆321Updated 3 years ago
- This is the official implementation of the EMNLP 2024 paper: Modeling Layout Reading Order as Ordering Relations for Visually-rich Docume…☆30Updated 3 weeks ago
- Aiming to achieve ultimate Multilingual TTS pipeline with main focus on releasing COQUI🐸TTS(Text-to-Speech) based high performing neural…☆42Updated 2 years ago
- ☆41Updated 3 years ago
- Code for BMVC2020 paper "Text and Style Conditioned GAN for Generation of Offline Handwriting Lines"☆74Updated 2 years ago
- Pretraining, fine-tuning and evaluation scripts for Indic-Wav2Vec2☆103Updated 5 months ago