BengaliAI / BADLAD
BADLAD: Bengali Document Layout Analysis Dataset
☆13Updated last year
Alternatives and similar repositories for BADLAD
Users who are interested in BADLAD are comparing it to the repositories listed below.
- This python module is an easy-to-use port of the text normalization used in the paper "Not low-resource anymore: Aligner ensembling, batc…☆33Updated last year
- Bangla Unicode Normalization☆20Updated last year
- [Computer Speech & Language] A transformer-based spelling error correction framework for Bangla and resource scarce Indic languages☆13Updated last year
- A Java toolkit to generate multi fonts Arabic text images☆11Updated 4 years ago
- Code, models, and data for "Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation". EMNLP 2023.☆17Updated last year
- ☆47Updated 2 years ago
- This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Da…☆149Updated 10 months ago
- PyTorch implementation of the paper 'BANNER: A Cost-Sensitive Contextualized Model for Bangla Named Entity Recognition'☆13Updated 5 years ago
- [ICCV 2023] ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules☆23Updated last year
- ☆54Updated 2 months ago
- This repository contains the Arabic sarcasm dataset (ArSarcasm)☆24Updated 4 years ago
- This repository contains the code, data, and models of the paper titled "CrossSum: Beyond English-Centric Cross-Lingual Summarization for…☆52Updated last year
- ☆139Updated last year
- This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 4…☆275Updated last year
- Automatic Context-Sensitive Spelling Correction for Bangla Text Using BERT and Levenshtein Distance☆21Updated 10 months ago
- Python interface for evaluation on ChatGPT models☆19Updated last year
- This is the official implementation to the EMNLP 2024 paper: Modeling Layout Reading Order as Ordering Relations for Visually-rich Docume…☆28Updated 10 months ago
- indicTranslate v1 - Machine Translation for 11 Indic languages. For latest v2, check: https://github.com/AI4Bharat/IndicTrans2☆130Updated last year
- Transcribing audio files using Hugging Face's implementation of Wav2Vec2 + "chain-linking" NLP tasks to combine speech-to-text with downs…☆32Updated 4 years ago
- ☆42Updated 2 years ago
- We identify the desiderata for a comprehensive benchmark and propose Visually Rich Document Understanding (VRDU). VRDU contains two datas…☆80Updated 2 years ago
- SemEval 2024 Task 1: Textual Semantic Relatedness☆26Updated last year
- An implementation of the paper titled "Arabic Speech Emotion Recognition Employing Wav2vec2.0 and HuBERT Based on BAVED Dataset" https://…☆12Updated 3 years ago
- Repository for Multilingual-VQA task created during HuggingFace JAX/Flax community week.☆34Updated 4 years ago
- NTREX -- News Test References for MT Evaluation☆85Updated last year
- Aiming to achieve ultimate Multilingual TTS pipeline with main focus on releasing COQUI🐸TTS(Text-to-Speech) based high performing neural…☆43Updated 2 years ago
- Arabic cleaning, normalization and segmentation library.☆71Updated last year
- Solutions provided to Chip Huyen's Machine Learning Interview Book with GPT☆41Updated last year
- Several deep learning models for restoring Arabic diacritics using PyTorch.☆35Updated 3 years ago
- A PyPI package for fast word/character error rate (WER/CER) calculation☆72Updated 2 years ago