salesforce / BannerGen
☆26 · Updated last month
Related projects
Alternatives and complementary repositories for BannerGen
- The official PyTorch implementation for arXiv'23 paper 'LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer' ☆69 · Updated last year
- Official implementation of Generative Colorization of Structured Mobile Web Pages, WACV 2023. ☆21 · Updated 11 months ago
- This is the official repository for the paper "OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data". … ☆57 · Updated 5 months ago
- Implementation of Conditional ViT on LAION - Referred Visual Search - Fashion ☆38 · Updated 2 months ago
- Load any clip model with a standardized interface ☆21 · Updated 6 months ago
- Iterable datapipelines for pytorch training. ☆81 · Updated 2 months ago
- Official Repo of Graphist ☆97 · Updated 6 months ago
- Recaption large (Web)Datasets with vllm and save the artifacts. ☆30 · Updated last month
- Towards Flexible Multi-modal Document Models [Inoue+, CVPR2023] ☆55 · Updated last year
- OpenCOLE: Towards Reproducible Automatic Graphic Design Generation [Inoue+, CVPRW2024 (GDUG)] ☆46 · Updated last week
- ☆59 · Updated last year
- Aggregating embeddings over time ☆31 · Updated last year
- Video-LlaVA fine-tune for CinePile evaluation ☆38 · Updated 3 months ago
- Official code repo for "Editing Implicit Assumptions in Text-to-Image Diffusion Models" ☆80 · Updated last year
- M4 experiment logbook ☆56 · Updated last year
- ☆64 · Updated last year
- Fast Sprite Decomposition from Animated Graphics [ECCV2024] ☆26 · Updated last month
- ☆100 · Updated 9 months ago
- The largest multilingual image-text classification dataset. It contains fashion products. ☆69 · Updated last year
- Guide diffusion on ImageBind embedding similarity ☆27 · Updated last year
- ☆24 · Updated last week
- Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image … ☆51 · Updated 3 weeks ago
- Official Pytorch implementation of "CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion" (TMLR 2024) ☆80 · Updated 3 months ago
- This is a public repository for Image Clustering Conditioned on Text Criteria (IC|TC) ☆79 · Updated 7 months ago
- ☆71 · Updated last year
- ☆78 · Updated 10 months ago
- A huge dataset for Document Visual Question Answering ☆13 · Updated 3 months ago
- A Gradio component that can be used to annotate images with bounding boxes. ☆31 · Updated last week
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data ☆32 · Updated 7 months ago