emanuelevivoli / awesome-comics-understanding
The official repo of the Comics Survey: "A missing piece in Vision and Language: A Survey on Comics Understanding"
☆38 · Updated this week
Related projects:
- [ECCV 2024] - Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation ☆39 · Updated last week
- ☆15 · Updated 7 months ago
- [ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset ☆47 · Updated last month
- [ICLR 2024] Official code for the paper "LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts" ☆65 · Updated 4 months ago
- [ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion ☆143 · Updated 4 months ago
- Code and Models for "GeneCIS: A Benchmark for General Conditional Image Similarity" ☆54 · Updated last year
- [CVPR 2024] Official repo for "InteractDiffusion: Interaction-Control for Text-to-Image Diffusion Model". ☆91 · Updated 2 months ago
- Official repository of paper titled "How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs". ☆39 · Updated 3 weeks ago
- [ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval" ☆37 · Updated 2 months ago
- Beyond Single Object Text-to-SVG Synthesis with Comprehensive Canvas Layout ☆11 · Updated 4 months ago
- Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023 ☆23 · Updated last year
- Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024) ☆98 · Updated last month
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ☆85 · Updated 2 weeks ago
- 👀 Visual Instruction Inversion: Image Editing via Visual Prompting (NeurIPS 2023) ☆82 · Updated 9 months ago
- Densely Captioned Images (DCI) dataset repository. ☆155 · Updated 2 months ago
- Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models. ECCV 2024 ☆34 · Updated last month
- Official Pytorch implementation of "CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion" (TMLR 2024) ☆73 · Updated last month
- ☆45 · Updated 2 months ago
- Composed Video Retrieval ☆42 · Updated 4 months ago
- Code base of SynthCLIP: CLIP training with purely synthetic text-image pairs from LLMs and TTIs. ☆84 · Updated 5 months ago
- Official implementation of "Describing Differences in Image Sets with Natural Language" (CVPR 2024 Oral) ☆97 · Updated 5 months ago
- [CVPR 2024 Best paper award candidate] EGTR: Extracting Graph from Transformer for Scene Graph Generation ☆55 · Updated 2 months ago
- Official repository of paper "Subobject-level Image Tokenization" ☆58 · Updated 4 months ago
- 🌋👵🏻 Yo'LLaVA: Your Personalized Language and Vision Assistant ☆47 · Updated last week
- ☆89 · Updated 4 months ago
- [ACM TOMM 2023] - Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features ☆156 · Updated last year
- ☆71 · Updated 9 months ago
- Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation. CVPR 2023 ☆51 · Updated last year
- [ECCV 2024] Parrot Captions Teach CLIP to Spot Text ☆58 · Updated 2 weeks ago
- ☆50 · Updated 2 years ago