emanuelevivoli / awesome-comics-understanding
The official repo of the Comics Survey: "A missing piece in Vision and Language: A Survey on Comics Understanding"
☆111 Updated 4 months ago
Alternatives and similar repositories for awesome-comics-understanding
Users who are interested in awesome-comics-understanding are comparing it to the repositories listed below.
- Beyond Single Object Text-to-SVG Synthesis with Comprehensive Canvas Layout ☆20 Updated 3 months ago
- [ICLR 2024] Official code for the paper "LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts" ☆76 Updated 11 months ago
- ☆23 Updated 2 months ago
- Official Implementation for "MyVLM: Personalizing VLMs for User-Specific Queries" (ECCV 2024) ☆169 Updated 10 months ago
- Official Pytorch implementation of "CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion" (TMLR 2024) ☆85 Updated 3 months ago
- [ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset ☆66 Updated 9 months ago
- Official PyTorch Implementation of "DiffusionPen: Towards Controlling the Style of Handwritten Text Generation" - ECCV 2024 ☆48 Updated 6 months ago
- [CVPR 2024] Official repo for "InteractDiffusion: Interaction-Control for Text-to-Image Diffusion Model". ☆120 Updated 3 months ago
- Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023 ☆24 Updated last year
- [ECCV 2024 Oral] ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction ☆65 Updated 9 months ago
- ☆109 Updated 3 months ago
- Davidsonian Scene Graph (DSG) for Text-to-Image Evaluation (ICLR 2024) ☆87 Updated 5 months ago
- MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance (ACM MM2024) ☆131 Updated last month
- Official code for paper: Desigen: A Pipeline for Controllable Design Template Generation [CVPR'24] ☆69 Updated 9 months ago
- 👀 Visual Instruction Inversion: Image Editing via Visual Prompting (NeurIPS 2023) ☆90 Updated last year
- TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering ☆160 Updated last year
- FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions ☆55 Updated last year
- ☆95 Updated last year
- [ECCV 2024] Official repo for UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diff… ☆224 Updated 3 months ago
- [NeurIPS 2024] Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis ☆67 Updated 3 months ago
- Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing (NeurIPS 2023) ☆103 Updated last year
- ConceptAttention: A method for interpreting multi-modal diffusion transformers. ☆250 Updated last month
- ReCo: Region-Controlled Text-to-Image Generation, CVPR 2023 ☆126 Updated last year
- [CVPR`2024, Oral] Attention Calibration for Disentangled Text-to-Image Personalization ☆103 Updated last year
- Official implementation of "Describing Differences in Image Sets with Natural Language" (CVPR 2024 Oral) ☆119 Updated last year
- CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts ☆147 Updated 11 months ago
- Code and Models for "GeneCIS A Benchmark for General Conditional Image Similarity" ☆58 Updated last year
- [ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion ☆173 Updated last year
- [WACV 2024] Training-Free Layout Control with Cross-Attention Guidance ☆257 Updated last year
- [ICLR 2025] - Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion ☆40 Updated 3 weeks ago