ayanban011 / GraphKD
[ICDAR 2024] (Best Student Paper🏆) Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation
☆12 · Updated 2 months ago
Related projects
Alternatives and complementary repositories for GraphKD
- Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023 ☆23 · Updated last year
- Beyond Single Object Text-to-SVG Synthesis with Comprehensive Canvas Layout ☆12 · Updated 6 months ago
- Discovering De-similarities of Modular Structure Between Tumor Cells and Normal Cells by Integrating Multiple Data Sources Through Joint … ☆8 · Updated 2 months ago
- [NeurIPS'24] GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching ☆20 · Updated 7 months ago
- The official repo of the Comics Survey: "A missing piece in Vision and Language: A Survey on Comics Understanding" ☆81 · Updated 2 months ago
- The official repo for the technical report "Scalable Mask Annotation for Video Text Spotting" ☆17 · Updated last year
- The official code for “DeepEraser: Deep Iterative Context Mining for Generic Text Eraser”, TMM, 2024. ☆28 · Updated 2 months ago
- (CVPR 2024) Bridging the Gap Between End-to-End and Two-Step Text Spotting. ☆50 · Updated 5 months ago
- arXiv 23 "Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" ☆13 · Updated 9 months ago
- OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Percept… ☆73 · Updated last year
- ECCV2024_Parrot Captions Teach CLIP to Spot Text ☆60 · Updated 2 months ago
- (ICCV 2023) ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer ☆72 · Updated 7 months ago
- Text Image Inpainting via Global Structure-Guided Diffusion Models (Accepted by AAAI-24) ☆52 · Updated 5 months ago
- ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting ☆20 · Updated 3 months ago
- RoDLA: Benchmarking the Robustness of Document Layout Analysis Models ☆29 · Updated 7 months ago
- Official implementation of High Fidelity Scene Text Synthesis. ☆36 · Updated this week
- [IJCV 2024] TransDETR: End-to-end Video Text Spotting with Transformer ☆102 · Updated 7 months ago
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models ☆53 · Updated 3 weeks ago
- Official PyTorch Implementation of "WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models" - ICDAR 2023 ☆68 · Updated 4 months ago
- Multimodal Semi-Supervised Learning for Text Recognition (SemiMTR) ☆81 · Updated last year
- The official code for the CVPR 2024 paper: Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer ☆45 · Updated 5 months ago
- Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want ☆61 · Updated last month
- The official code of CornerTransformer (ECCV 2022, Oral) on top of MMOCR. ☆138 · Updated last year
- [ICDAR 2023] (Oral) An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation ☆70 · Updated 2 months ago
- Turning a CLIP Model into a Scene Text Detector (CVPR2023) | Turning a CLIP Model into a Scene Text Spotter (TPAMI) ☆181 · Updated 5 months ago
- Dreambooth (LoRA) with well-organized code structure. Naive adaptation from 🤗Diffusers. ☆13 · Updated last year
- Official implementation of UPOCR: Towards unified pixel-level OCR interface (ICML 2024) ☆39 · Updated 5 months ago