liyingxuan1012 / zeroshot-speaker-prediction
Official repository of "Zero-Shot Character Identification and Speaker Prediction in Comics via Iterative Multimodal Fusion" (ACMMM 2024)
☆15Updated last year
Alternatives and similar repositories for zeroshot-speaker-prediction
Users who are interested in zeroshot-speaker-prediction are comparing it to the libraries listed below.
- Official repository of Manga109Dialog (ICME 2024)☆26Updated last year
- ☆24Updated last year
- Various annotations of Manga109 dataset☆13Updated 8 months ago
- COO: Comic onomatopoeia dataset (ECCV 2022)☆87Updated 10 months ago
- MangaLMM – Try the official demo below☆30Updated 2 months ago
- [CVPR 2024 Oral] Official repository for RALF: Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation☆139Updated last year
- A paper collection of recent diffusion models for text-image generation tasks, e.g., visual text generation, font generation, text remova…☆267Updated last year
- [NeurIPS2023] This is the official code of the paper "GlyphControl: Glyph Conditional Control for Visual Text Generation"☆239Updated last year
- [ICCV 2025] LayerD: Decomposing Raster Graphic Designs into Layers☆42Updated this week
- ☆136Updated 2 years ago
- Comics Dataset Framework for Comics Understanding☆34Updated 4 months ago
- NDL古典籍OCR (NDL Kotenseki OCR) application, source code included☆83Updated 2 months ago
- [AAAI2024] FontDiffuser: One-Shot Font Generation via Denoising Diffusion with Multi-Scale Content Aggregation and Style Contrastive Lear…☆477Updated last year
- The official repo of the Comics Survey: "A missing piece in Vision and Language: A Survey on Comics Understanding"☆133Updated last year
- Official implementation for ICDAR 2024 Oral paper "ICAL: Implicit Character-Aided Learning for Enhanced Handwritten Mathematical Expressi…☆28Updated last year
- Continuous diffusion for layout generation☆52Updated 10 months ago
- Oracle Bone Script data collected by VLRLab of HUST☆65Updated last year
- ☆46Updated 11 months ago
- Generate a transcript for your favourite Manga: Detect manga characters, text blocks and panels. Order panels. Cluster characters. Match …☆413Updated 6 months ago
- Official repository for CF-Font: Content Fusion for Few-shot Font Generation.☆141Updated 2 years ago
- ☆21Updated last year
- OpenCOLE: Towards Reproducible Automatic Graphic Design Generation [Inoue+, CVPRW2024 (GDUG)]☆84Updated 9 months ago
- [CVPR 2023] LayoutDM: Discrete Diffusion Model for Controllable Layout Generation☆292Updated 2 years ago
- The official code for NeurIPS 2024 paper: Harmonizing Visual Text Comprehension and Generation☆129Updated last year
- The official project of paper "Visual Text Processing: A Comprehensive Review and Unified Evaluation"☆92Updated 2 months ago
- Official PyTorch implementation of the CVPR 2022 paper: "Look Closer to Supervise Better: One-Shot Font Generation via Component-Based Di…☆94Updated 3 years ago
- [CVPR 2023 highlight] Towards Flexible Multi-modal Document Models☆59Updated 2 years ago
- [2024-NeurIPS] TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control☆97Updated 9 months ago
- This repository is the code of our paper "DiffUTE: Universal Text Editing Diffusion Model" (NeurIPS'2023).☆144Updated 9 months ago
- ☆204Updated last year