CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet
☆223 · Dec 16, 2022 · Updated 3 years ago
Alternatives and similar repositories for FT-CLIP
Users who are interested in FT-CLIP are comparing it to the libraries listed below
- code release of research paper "Exploring Long-Sequence Masked Autoencoders" ☆100 · Oct 14, 2022 · Updated 3 years ago
- Robust fine-tuning of zero-shot models ☆760 · Apr 29, 2022 · Updated 3 years ago
- ☆576 · Jul 19, 2022 · Updated 3 years ago
- ☆666 · Nov 28, 2023 · Updated 2 years ago
- [CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners ☆381 · Jun 1, 2023 · Updated 2 years ago
- [ECCV 2022] Bootstrapped Masked Autoencoders for Vision BERT Pretraining ☆97 · Nov 2, 2022 · Updated 3 years ago
- [NeurIPS 2023] Text data, code and pre-trained models for paper "Improving CLIP Training with Language Rewrites" ☆288 · Jan 14, 2024 · Updated 2 years ago
- This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling". ☆1,029 · Sep 29, 2022 · Updated 3 years ago
- Code for the paper titled "CiT: Curation in Training for Effective Vision-Language Data". ☆78 · Jan 18, 2023 · Updated 3 years ago
- ☆199 · May 10, 2023 · Updated 2 years ago
- EVA Series: Visual Representation Fantasies from BAAI ☆2,652 · Aug 1, 2024 · Updated last year
- [NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training" ☆319 · Jun 3, 2024 · Updated last year
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm ☆677 · Sep 19, 2022 · Updated 3 years ago
- Paper List for In-context Learning 🌷 ☆19 · Jan 3, 2023 · Updated 3 years ago
- Exploring Visual Prompts for Adapting Large-Scale Models ☆289 · Jun 6, 2022 · Updated 3 years ago
- [ICCV 2023] You Only Look at One Partial Sequence ☆343 · Oct 21, 2023 · Updated 2 years ago
- Official implementation for the paper "Prompt Pre-Training with Over Twenty-Thousand Classes for Open-Vocabulary Visual Recognition" ☆259 · May 3, 2024 · Updated last year
- Code release for SLIP: Self-supervision meets Language-Image Pre-training ☆787 · Feb 9, 2023 · Updated 3 years ago
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training ☆141 · Dec 16, 2025 · Updated 3 months ago
- [CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale ☆213 · Feb 27, 2024 · Updated 2 years ago
- [NeurIPS 2022] code for "K-LITE: Learning Transferable Visual Models with External Knowledge" https://arxiv.org/abs/2204.09222 ☆53 · Jun 12, 2023 · Updated 2 years ago
- [CVPR 2023] RILS: Masked Visual Reconstruction in Language Semantic Space (https://arxiv.org/abs/2301.06958) ☆44 · Sep 5, 2023 · Updated 2 years ago
- Cross-modal few-shot adaptation with CLIP ☆352 · Apr 29, 2025 · Updated 10 months ago
- An open source implementation of CLIP. ☆13,528 · Mar 12, 2026 · Updated last week
- Official PyTorch implementation of "Extract Free Dense Labels from CLIP" (ECCV 22 Oral) ☆468 · Sep 19, 2022 · Updated 3 years ago
- [NeurIPS 2022] Implementation of "AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition" ☆377 · Sep 16, 2022 · Updated 3 years ago
- BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training ☆400 · Oct 23, 2024 · Updated last year
- ConvMAE: Masked Convolution Meets Masked Autoencoders ☆523 · Mar 14, 2023 · Updated 3 years ago
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning ☆296 · Mar 13, 2024 · Updated 2 years ago
- Replication of Pix2Seq with Pretrained Model ☆59 · Nov 6, 2021 · Updated 4 years ago
- ☆76 · Sep 30, 2022 · Updated 3 years ago
- An official PyTorch implementation for CLIPPR ☆30 · Jul 22, 2023 · Updated 2 years ago
- Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22) ☆2,184 · May 20, 2024 · Updated last year
- SVIT: Scaling up Visual Instruction Tuning ☆166 · Jun 20, 2024 · Updated last year
- PyTorch implementation of ICML 2023 paper "SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation" ☆101 · Jun 28, 2023 · Updated 2 years ago
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language". ☆18 · Sep 17, 2021 · Updated 4 years ago
- [CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language ☆1,341 · Oct 5, 2023 · Updated 2 years ago
- Grounded Language-Image Pre-training ☆2,580 · Jan 24, 2024 · Updated 2 years ago
- [CVPR 2022] Official code for "Unified Contrastive Learning in Image-Text-Label Space" ☆408 · Nov 10, 2023 · Updated 2 years ago