emanuelevivoli / CoMix
Comics Dataset Framework for Comics Understanding
☆23Updated 4 months ago
Alternatives and similar repositories for CoMix
Users who are interested in CoMix are comparing it to the repositories listed below
- [ECCV-W] Official repo for the paper "ComiCap: A VLMs pipeline for dense captioning of Comic Panels"☆12Updated 7 months ago
- The official repo of the Comics Survey: "A missing piece in Vision and Language: A Survey on Comics Understanding"☆118Updated 6 months ago
- [ICLR 2024] Official code for the paper "LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts"☆81Updated last year
- [CVPR 2024] Official PyTorch implementation of "ECLIPSE: Revisiting the Text-to-Image Prior for Efficient Image Generation"☆64Updated last year
- [ECCV 2024] Parrot Captions Teach CLIP to Spot Text☆66Updated 10 months ago
- A tool for benchmarking image generation models.☆33Updated 2 years ago
- Implementation for "Correcting Diffusion Generation through Resampling" [CVPR 2024]☆33Updated last year
- ☆96Updated 11 months ago
- [NeurIPS 2022: Score-Based Modeling Workshop] Multiresolution Textual Inversion☆99Updated 2 years ago
- Code and Dataset for FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context.☆20Updated 2 years ago
- Gradient-Free Textual Inversion for Personalized Text-to-Image Generation☆43Updated 2 years ago
- Official PyTorch implementation of "CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion" (TMLR 2024)☆85Updated 5 months ago
- ☆21Updated 2 years ago
- Official codebase for Margin-aware Preference Optimization for Aligning Diffusion Models without Reference (MaPO).☆78Updated last year
- A curated list of papers and resources for text-to-image evaluation.☆29Updated last year
- [CVPR 2023] SketchXAI: A First Look at Explainability for Human Sketches☆24Updated last year
- OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Percept…☆81Updated 2 years ago
- Davidsonian Scene Graph (DSG) for Text-to-Image Evaluation (ICLR 2024)☆90Updated 7 months ago
- Diffusion attentive attribution maps for interpreting Stable Diffusion for image-to-image attention.☆55Updated 6 months ago
- 🤗 Unofficial huggingface/diffusers-based implementation of the paper "Training-Free Structured Diffusion Guidance for Compositional Text…☆120Updated 2 years ago
- Official code for paper: Desigen: A Pipeline for Controllable Design Template Generation [CVPR'24]☆70Updated 11 months ago
- Code for the paper "Manipulating Embeddings of Stable Diffusion Prompts".☆14Updated 11 months ago
- Official code repo for "Editing Implicit Assumptions in Text-to-Image Diffusion Models"☆86Updated 2 years ago
- DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models☆46Updated last year
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data☆34Updated last year
- This repository provides utilities to build a minimal dataset for InstructPix2Pix-like training of diffusion models.☆47Updated 2 years ago
- 🤗 Unofficial huggingface/diffusers-based implementation of the paper "Training-Free Layout Control with Cross-Attention Guidance".☆42Updated 2 years ago
- Implementation of MDP: A Generalized Framework for Text-Guided Image Editing by Manipulating the Diffusion Path☆69Updated 2 years ago
- Official code implementation for the paper "Direct Inversion: Optimization-Free Text-Driven Real Image Editing with Diffusion Models".☆25Updated 2 years ago
- FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions☆55Updated last year