damian0815 / finetune-clip-huggingface
Finetuning CLIP on a small image/text dataset using huggingface libs
☆49Updated 2 years ago
Alternatives and similar repositories for finetune-clip-huggingface
Users who are interested in finetune-clip-huggingface are comparing it to the repositories listed below.
- OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Percept…☆81Updated 2 years ago
- [ACM TOMM 2023] - Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features☆181Updated last year
- Fine tuning OpenAI's CLIP model on Indian Fashion Dataset☆50Updated 2 years ago
- Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)☆172Updated 2 months ago
- ☆92Updated 2 years ago
- Image Editing Anything☆116Updated 2 years ago
- The official PyTorch implementation for arXiv'23 paper 'LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer'☆100Updated 3 months ago
- Official Pytorch implementation of "CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion" (TMLR 2024)☆86Updated 6 months ago
- CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts☆153Updated last year
- Open source implementation of "A Self-Supervised Descriptor for Image Copy Detection" (SSCD).☆342Updated 3 years ago
- ☆125Updated last year
- The benchmark of SOTA text-to-image diffusion models with a new benchmarking strategy based on MiniGPT-4, namely X-IQE.☆124Updated 2 years ago
- Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024)☆135Updated last year
- [CVPR2023] A faster, smaller, and better text-to-image model for large-scale training☆242Updated last year
- ECCV2024_Parrot Captions Teach CLIP to Spot Text☆66Updated 11 months ago
- [ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion☆185Updated 3 weeks ago
- ALIGN trained on COYO-dataset☆29Updated last year
- [ICCV 2023] BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion☆269Updated 9 months ago
- Densely Captioned Images (DCI) dataset repository.☆189Updated last year
- This is a public repository for Image Clustering Conditioned on Text Criteria (IC|TC)☆91Updated last year
- Official implementation of "StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing" (CVMJ2024)☆78Updated last year
- Dataset Diffusion: Diffusion-based Synthetic Data Generation for Pixel-Level Semantic Segmentation (NeurIPS2023)☆124Updated 11 months ago
- [CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation☆79Updated last year
- CLIP-based aesthetics predictor inspired by the interface of 🤗 huggingface transformers.☆39Updated 2 weeks ago
- Text-To-Image Generation with Chinese Characters☆130Updated 2 years ago
- Data release for the ImageInWords (IIW) paper.☆217Updated 9 months ago
- ACM MM'23 (oral), SUR-adapter for pre-trained diffusion models can acquire the powerful semantic understanding and reasoning capabilities…☆120Updated last year
- Finetuning CLIP for Few Shot Learning☆45Updated 3 years ago
- [AAAI 2023] Painterly image harmonization in both spatial domain and frequency domain.☆55Updated 3 months ago
- [CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"☆208Updated last year