BengaliAI / BADLAD
BADLAD: Bengali Document Layout Analysis Dataset
☆14Updated last year
Alternatives and similar repositories for BADLAD
Users who are interested in BADLAD are comparing it to the libraries listed below.
- Bangla Unicode Normalization☆21Updated last year
- This Python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batc…☆35Updated last year
- [Computer Speech & Language] A transformer-based spelling error correction framework for Bangla and resource scarce Indic languages☆14Updated last year
- Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understan…☆356Updated 3 years ago
- [ICDAR 2023] (Oral) An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation☆76Updated last year
- This is the official implementation of the EMNLP 2024 paper: Modeling Layout Reading Order as Ordering Relations for Visually-rich Docume…☆28Updated last year
- Automatic Context Sensitive Spelling Correction for Bangla Text Using BERT and Levenshtein Distance☆21Updated last year
- Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task…☆287Updated 2 years ago
- OCR Annotations from Amazon Textract for Industry Documents Library☆103Updated 3 years ago
- We identify the desiderata for a comprehensive benchmark and propose Visually Rich Document Understanding (VRDU). VRDU contains two datas…☆80Updated 2 years ago
- This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Da…☆152Updated last year
- Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset☆50Updated 2 years ago
- indicTranslate v1 - Machine Translation for 11 Indic languages. For latest v2, check: https://github.com/AI4Bharat/IndicTrans2☆133Updated last year
- DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis☆395Updated 2 years ago
- Context-Sensitive Neural Spelling Checker☆20Updated last year
- A framework for Arabic spelling correction using different seq2seq model architectures such as transformers and RNNs☆23Updated last year
- BNLP is a natural language processing toolkit for Bengali Language.☆306Updated 11 months ago
- This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 4…☆275Updated last year
- Segmenting text blocks and baselines from documents using deep learning techniques☆14Updated 4 years ago
- Transformer based Bangla Speech Recognition | Encoder Decoder Architecture☆54Updated 2 years ago
- [ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)☆42Updated 2 years ago
- Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023☆28Updated 2 years ago
- Official repository accompanying the ICDAR 2023 paper☆12Updated 2 years ago
- Easter2.0: IMPROVING CONVOLUTIONAL MODELS FOR HANDWRITTEN TEXT RECOGNITION☆79Updated 2 years ago
- Unofficial implementation of the paper "Full Page Handwriting Recognition via Image to Sequence Extraction" by Singh et al. (2021).☆53Updated 3 years ago