damian0815 / finetune-clip-huggingface
Finetuning CLIP on a small image/text dataset using huggingface libs
☆48 Updated 2 years ago
Alternatives and similar repositories for finetune-clip-huggingface
Users who are interested in finetune-clip-huggingface are comparing it to the repositories listed below
- OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Percept… ☆81 Updated 2 years ago
- [ICCV 2023] BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion ☆266 Updated 6 months ago
- Dreambooth (LoRA) with well-organized code structure. Naive adaptation from 🤗Diffusers. ☆13 Updated last year
- [ACM TOMM 2023] - Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features ☆177 Updated last year
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”. ☆39 Updated 7 months ago
- [ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion ☆173 Updated last year
- [CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation ☆70 Updated 10 months ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ☆130 Updated last week
- Official Pytorch implementation of "CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion" (TMLR 2024) ☆85 Updated 3 months ago
- Democratization of "PaLI: A Jointly-Scaled Multilingual Language-Image Model" ☆90 Updated last year
- Precision Search through Multi-Style Inputs ☆69 Updated 3 weeks ago
- A curated list of video-text datasets in a variety of languages. These datasets can be used for video captioning (video description) or v… ☆36 Updated last year
- CLIP-based aesthetics predictor inspired by the interface of 🤗 huggingface transformers. ☆36 Updated 11 months ago
- Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024… ☆38 Updated 5 months ago
- Official Implementations "StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing" (CVMJ2024) ☆73 Updated 9 months ago
- This repository is the code of our paper "DiffUTE: Universal Text Editing Diffusion Model" (NeurIPS'2023). ☆131 Updated last month
- CVPR2023 paper ☆51 Updated last year
- Code Implementation of "Uni-paint: A Unified Framework for Multimodal Image Inpainting with Pretrained Diffusion Model" ☆116 Updated 2 months ago
- [NeurIPS 2023] Customize spatial layouts for conditional image synthesis models, e.g., ControlNet, using GPT ☆136 Updated last year
- ECCV2024_Parrot Captions Teach CLIP to Spot Text ☆66 Updated 8 months ago
- CLIPScore EMNLP code ☆222 Updated 2 years ago
- [CVPR 2023 Highlight] Freestyle Layout-to-Image Synthesis ☆153 Updated 2 years ago
- Code and Models for "GeneCIS A Benchmark for General Conditional Image Similarity" ☆58 Updated last year
- Official code for CVPR 2024 paper: Discriminative Probing and Tuning for Text-to-Image Generation ☆32 Updated last month
- Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023 ☆24 Updated last year
- Code for Shifted Diffusion for Text-to-image Generation (CVPR 2023) ☆162 Updated last year
- ☆91 Updated last year
- Code for Learning Subject-Aware Cropping by Outpainting Professional Photos ☆18 Updated last year
- LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation ☆130 Updated last year
- ☆97 Updated last year