CyberAgentAILab / flex-dm
Towards Flexible Multi-modal Document Models [Inoue+, CVPR2023]
☆55 · Updated last year
Related projects:
- Implementation of CanvasVAE: Learning to Generate Vector Graphic Documents, ICCV 2021 ☆61 · Updated last year
- Cheng-Fu Yang*, Wan-Cyuan Fan*, Fu-En Yang, Yu-Chiang Frank Wang, "LayoutTransformer: Scene Layout Generation with Conceptual and Spatial… ☆58 · Updated 2 years ago
- The official PyTorch implementation for arXiv'23 paper 'LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer' ☆68 · Updated 11 months ago
- OpenCOLE: Towards Reproducible Automatic Graphic Design Generation [Inoue+, CVPRW2024 (GDUG)] ☆41 · Updated last week
- Continuous diffusion for layout generation ☆26 · Updated 5 months ago
- ☆77 · Updated last year
- Official implementation of Generative Colorization of Structured Mobile Web Pages, WACV 2023. ☆21 · Updated 9 months ago
- Official code for paper: Desigen: A Pipeline for Controllable Design Template Generation [CVPR'24] ☆54 · Updated 2 months ago
- OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Percept… ☆72 · Updated last year
- ☆34 · Updated last month
- ECCV2024_Parrot Captions Teach CLIP to Spot Text ☆58 · Updated last week
- Official implementation of High Fidelity Scene Text Synthesis. ☆33 · Updated 3 weeks ago
- Official Pytorch implementation of "CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion" (TMLR 2024) ☆73 · Updated last month
- ☆78 · Updated 8 months ago
- ☆37 · Updated 3 weeks ago
- ☆32 · Updated 8 months ago
- BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild ☆24 · Updated 5 months ago
- ☆119 · Updated 7 months ago
- FuseCap: Large Language Model for Visual Data Fusion in Enriched Caption Generation ☆47 · Updated 5 months ago
- The official codes and datasets for Artistic Text Segmentation (ECCV 2024). ☆16 · Updated 2 months ago
- (wip) Use LAION-AI's CLIP "conditioned prior" to generate CLIP image embeds from CLIP text embeds. ☆27 · Updated 2 years ago
- Official code repo for "Editing Implicit Assumptions in Text-to-Image Diffusion Models" ☆81 · Updated last year
- Code for "DreamEdit: Subject-driven Image Editing" (TMLR2023) ☆105 · Updated 7 months ago
- Diffusion Layout Transformer implementation. ☆48 · Updated last year
- Official repository for "PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout" (CVPR 2023). ☆115 · Updated 2 months ago
- Code Release for the paper "Make-A-Story: Visual Memory Conditioned Consistent Story Generation" in CVPR 2023 ☆37 · Updated last year
- ☆65 · Updated last year
- Official code implementation for our paper -- Direct Inversion: Optimization-Free Text-Driven Real Image Editing with Diffusion Models. ☆25 · Updated last year
- T2VScore: Towards A Better Metric for Text-to-Video Generation ☆76 · Updated 5 months ago