[CBMI 2024 Best Paper] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?".
☆32 · May 12, 2025 · Updated 10 months ago
Alternatives and similar repositories for FG-CLIP
Users that are interested in FG-CLIP are comparing it to the libraries listed below.
- [CVPR 2024 Highlight] Official repository of the paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detec… ☆67 · Apr 4, 2025 · Updated 11 months ago
- [ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset ☆86 · Aug 6, 2025 · Updated 7 months ago
- [ICCV 2025] Official repository of the paper "Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabular… ☆180 · Nov 10, 2025 · Updated 4 months ago
- A Massive Multi-Discipline Lecture Understanding Benchmark ☆33 · Nov 1, 2025 · Updated 4 months ago
- This repository is the official implementation of our paper Robust Diffusion Model-Generated Image Detection with CLIP, accepted by MIPR … ☆10 · Jun 13, 2024 · Updated last year
- code for FineLIP ☆40 · Nov 25, 2025 · Updated 3 months ago
- [ECCVW/TWYN 2024 - Best Workshop Paper] Are CLIP features all you need for Universal Synthetic Image Origin Attribution? ☆12 · Feb 1, 2025 · Updated last year
- Repository for evaluating Pegasus-1 and video-language foundation models ☆14 · Nov 12, 2024 · Updated last year
- An Evaluation Framework for Temporal Information Extraction Systems ☆20 · Feb 19, 2026 · Updated last month
- This repository contains the code for our CVPR 2024 paper, ☆15 · Aug 27, 2024 · Updated last year
- A large scale dataset for Video Captioning in Italian ☆13 · May 16, 2023 · Updated 2 years ago
- A vision-language model with an improved cross-attention mechanism for scalable streaming inference ☆28 · Mar 9, 2026 · Updated 2 weeks ago
- Reproduced the DFT method without using Verl. https://arxiv.org/abs/2508.05629 ☆21 · Oct 14, 2025 · Updated 5 months ago
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer. ☆14 · Mar 2, 2024 · Updated 2 years ago
- ☆22 · May 18, 2025 · Updated 10 months ago
- VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model ☆15 · Jul 31, 2025 · Updated 7 months ago
- Container Specification, Tutorials and Examples for the AI4EU Experiments docker/grpc format for models ☆13 · Jul 10, 2022 · Updated 3 years ago
- [COG24] - Official repository of "OfflineMania: A Benchmark Environment for Offline Reinforcement Learning in Racing Games" ☆12 · Jul 15, 2024 · Updated last year
- Learning to Count without Annotations ☆23 · May 24, 2024 · Updated last year
- [ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction ☆201 · Feb 5, 2024 · Updated 2 years ago
- ☆17 · Oct 22, 2024 · Updated last year
- [AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning" ☆68 · Dec 8, 2025 · Updated 3 months ago
- ☆13 · Apr 9, 2024 · Updated last year
- ECCV2024: Adversarial Prompt Tuning for Vision-Language Models ☆31 · Mar 7, 2026 · Updated 2 weeks ago
- A simple Computer Vision Framework, mainly based on PyTorch. Including distributed training, logging and so on. ☆12 · Dec 2, 2023 · Updated 2 years ago
- ☆16 · Sep 6, 2024 · Updated last year
- [MICCAI 2023] (early accept) UOD: universal oneshot detection of anatomical landmarks. https://arxiv.org/abs/2306.07615 ☆12 · Jan 4, 2024 · Updated 2 years ago
- [NeurIPS 2025] The official code for "IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation" ☆22 · Jun 5, 2025 · Updated 9 months ago
- Composed Video Retrieval ☆63 · May 2, 2024 · Updated last year
- Examples of Verbalized Machine Learning (VML) ☆16 · Mar 16, 2025 · Updated last year
- [ICLR 2025] - Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion ☆62 · Nov 30, 2025 · Updated 3 months ago
- [TCSVT] state-of-the-art open vocabulary detector on COCO/LVIS/V3Det ☆33 · Jun 3, 2025 · Updated 9 months ago
- [CVPR2025] Official implementation of RAM ☆29 · Nov 4, 2025 · Updated 4 months ago
- Multiresolution Learning-based Hybrid Transformer-CNN Model for Anatomical Landmark Detection ☆12 · Nov 5, 2023 · Updated 2 years ago
- A PyTorch implementation of NormSoftmax based on BMVC 2019 paper "Classification is a Strong Baseline for Deep Metric Learning" ☆10 · Mar 15, 2020 · Updated 6 years ago
- ☆22 · Nov 25, 2025 · Updated 3 months ago
- [ECCV2024] The official implementation of the DiffPNG paper in PyTorch. ☆17 · Oct 17, 2024 · Updated last year
- Repo of NeurIPS23 ☆18 · Oct 25, 2023 · Updated 2 years ago
- [ECCV2024] FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance ☆17 · Sep 11, 2024 · Updated last year