CyberAgentAILab / webcolor
Official implementation of Generative Colorization of Structured Mobile Web Pages, WACV 2023.
☆22Updated last year
Alternatives and similar repositories for webcolor
Users who are interested in webcolor are comparing it to the repositories listed below
- [CVPR 2023 highlight] Towards Flexible Multi-modal Document Models☆59Updated last year
- OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Percept…☆81Updated 2 years ago
- The official PyTorch implementation for arXiv'23 paper 'LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer'☆100Updated 2 months ago
- [ECCV2022] Mind the Gap in Distilling StyleGANs☆29Updated 2 years ago
- ☆17Updated 2 years ago
- An interactive demo based on Segment-Anything for stroke-based painting which enables human-like painting.☆35Updated 2 years ago
- This is the official repository for CookGAN: Meal Image Synthesis from Ingredients☆23Updated 2 years ago
- This repository contains the source code for SoftCTC. The original paper can be found here: https://arxiv.org/abs/2212.02135☆19Updated 2 years ago
- Source code of the TextLap model, an LLM for text-to-layout generation.☆15Updated 9 months ago
- ☆18Updated 10 months ago
- ☆80Updated 2 years ago
- [CVPR 2024 Oral] Official repository for RALF: Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation☆133Updated last year
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data☆34Updated last year
- ☆29Updated 2 years ago
- Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish…☆19Updated last year
- OpenCOLE: Towards Reproducible Automatic Graphic Design Generation [Inoue+, CVPRW2024 (GDUG)]☆77Updated 4 months ago
- [ECCV2024][ICCV2023] Official PyTorch implementation of SeiT++ and SeiT☆55Updated 11 months ago
- Official implementation of OSSGAN [CVPR 2022]☆21Updated 3 years ago
- ☆27Updated 4 years ago
- [NeurIPS 2022: Score-Based Modeling Workshop] Multiresolution Textual Inversion☆99Updated 2 years ago
- High-Resolution Image Synthesis with Latent Diffusion Models☆9Updated 3 years ago
- FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions☆55Updated last year
- Code for CVPR'2022 paper ✨ "Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-L…☆37Updated 3 years ago
- This project provides a data set with bounding boxes, body poses, 3D face meshes & captions of people from our LAION-2.2B. Additionally i…☆14Updated 3 years ago
- The official repository of paper "ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection" (N…☆50Updated last year
- LLaVA combined with the Magvit image tokenizer, training an MLLM without a vision encoder. Unifying image understanding and generation.☆37Updated last year
- An official codebase for the paper "CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos (ICCV 23)"☆52Updated last year
- This repository is associated with the research paper titled ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large…☆12Updated 2 months ago
- Evaluation benchmark for the task of Semantic Image Translation. Contains code to run FlexIT (CVPR 2022)☆34Updated 3 years ago
- Un-*** 50 billions multimodality dataset☆23Updated 2 years ago