damian0815 / finetune-clip-huggingface
Finetuning CLIP on a small image/text dataset using huggingface libs
☆52 Updated 2 years ago
Alternatives and similar repositories for finetune-clip-huggingface
Users who are interested in finetune-clip-huggingface are comparing it to the repositories listed below.
- [ACM TOMM 2023] - Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features ☆188 Updated 2 years ago
- PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022) ☆246 Updated 5 months ago
- Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143) ☆179 Updated 5 months ago
- OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Percept… ☆82 Updated 2 years ago
- Open source implementation of "A Self-Supervised Descriptor for Image Copy Detection" (SSCD). ☆367 Updated 3 years ago
- CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts ☆158 Updated last year
- Fine tuning OpenAI's CLIP model on Indian Fashion Dataset ☆52 Updated 2 years ago
- Implementation of PALI3 from the paper "PALI-3 VISION LANGUAGE MODELS: SMALLER, FASTER, STRONGER" ☆145 Updated 3 weeks ago
- Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic ☆279 Updated 3 years ago
- CLIPScore EMNLP code ☆239 Updated 2 years ago
- [NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites" ☆287 Updated last year
- Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023 ☆28 Updated 2 years ago
- Democratization of "PaLI: A Jointly-Scaled Multilingual Language-Image Model" ☆91 Updated last year
- Search photos on Unsplash based on OpenAI's CLIP model, support search with joint image+text queries and attention visualization. ☆223 Updated 4 years ago
- Official Pytorch implementation of "CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion" (TMLR 2024) ☆87 Updated 9 months ago
- [IEEE TPAMI] Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation ☆326 Updated 5 months ago
- [ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion ☆195 Updated 3 months ago
- [NeurIPS2023] This is the official code of the paper "GlyphControl: Glyph Conditional Control for Visual Text Generation" ☆237 Updated last year
- Davidsonian Scene Graph (DSG) for Text-to-Image Evaluation (ICLR 2024) ☆100 Updated 11 months ago
- Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024) ☆136 Updated last year
- Image Editing Anything ☆116 Updated 2 years ago
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings) ☆201 Updated last year
- [CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation ☆84 Updated last year
- Generate text captions for images from their embeddings. ☆116 Updated 2 years ago
- CLIP-based aesthetics predictor inspired by the interface of 🤗 huggingface transformers. ☆41 Updated 3 months ago
- LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation ☆133 Updated 2 years ago
- Better Aligning Text-to-Image Models with Human Preference. ICCV 2023 ☆290 Updated 2 years ago
- Densely Captioned Images (DCI) dataset repository. ☆192 Updated last year
- [CVPR 2023 (Highlight)] FAME-ViL: Multi-Tasking V+L Model for Heterogeneous Fashion Tasks ☆55 Updated 2 years ago
- Code and Models for "GeneCIS A Benchmark for General Conditional Image Similarity" ☆61 Updated 2 years ago