omron-sinicx / scipostlayout
☆20 · Updated 10 months ago
Alternatives and similar repositories for scipostlayout
Users who are interested in scipostlayout are comparing it to the libraries listed below.
- Continuous diffusion for layout generation ☆45 · Updated 4 months ago
- ECCV2024_Parrot Captions Teach CLIP to Spot Text ☆66 · Updated 9 months ago
- [CVPR2025] Official implementation of High Fidelity Scene Text Synthesis. ☆65 · Updated 3 months ago
- [CVPR 2023 highlight] Towards Flexible Multi-modal Document Models ☆57 · Updated last year
- LayoutFlow: Flow Matching for Layout Generation [Andrade Guerreiro et al., ECCV 2024] ☆32 · Updated last month
- A Large-scale Dataset for training and evaluating a model's ability on Dense Text Image Generation ☆70 · Updated 4 months ago
- OpenCOLE: Towards Reproducible Automatic Graphic Design Generation [Inoue+, CVPRW2024 (GDUG)] ☆73 · Updated 3 months ago
- Code for ACM MM'23 paper: LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation ☆48 · Updated 10 months ago
- Official Implementation of ICLR'24: Kosmos-G: Generating Images in Context with Multimodal Large Language Models ☆71 · Updated last year
- ☆97 · Updated last year
- Official code for paper: Desigen: A Pipeline for Controllable Design Template Generation [CVPR'24] ☆70 · Updated 11 months ago
- [CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation ☆72 · Updated 11 months ago
- [arXiv: 2505.12307] LogicOCR: Do Your Large Multimodal Models Excel at Logical Reasoning on Text-Rich Images? ☆22 · Updated last month
- OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Percept… ☆81 · Updated 2 years ago
- LayoutDiT: Exploring Content-Graphic Balance in Layout Generation with Diffusion Transformer ☆45 · Updated 5 months ago
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”. ☆40 · Updated 9 months ago
- AnyTrans: Translate AnyText in the Image with Large Scale Models (EMNLP2024 Findings) ☆18 · Updated 6 months ago
- Source code of the TextLap model, an LLM for text-2-layout generation. ☆15 · Updated 8 months ago
- ☆23 · Updated 2 months ago
- Dreambooth (LoRA) with well-organized code structure. Naive adaptation from 🤗Diffusers. ☆13 · Updated 2 years ago
- The official code for “DeepEraser: Deep Iterative Context Mining for Generic Text Eraser”, TMM, 2024. ☆39 · Updated 10 months ago
- [ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models ☆78 · Updated 9 months ago
- ☆50 · Updated 6 months ago
- [ICLR2025] Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want ☆77 · Updated 2 weeks ago
- ☆39 · Updated last year
- Unified layout planning and image generation ☆21 · Updated 2 months ago
- Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024… ☆45 · Updated 7 months ago
- ☆25 · Updated last year
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆45 · Updated last year
- Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis ☆85 · Updated 11 months ago