[CBMI 2024 Best Paper] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?".
☆32 · May 12, 2025 · Updated last year
Alternatives and similar repositories for FG-CLIP
Users that are interested in FG-CLIP are comparing it to the libraries listed below.
- [CVPR 2024 Highlight] Official repository of the paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detec… ☆67 · Apr 4, 2025 · Updated last year
- [ICML2024] Official PyTorch implementation of CoMC: Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition ☆17 · Jul 9, 2024 · Updated last year
- [ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset ☆87 · Aug 6, 2025 · Updated 9 months ago
- [ICCV 2025] Official repository of the paper "Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabular… ☆186 · Nov 10, 2025 · Updated 6 months ago
- [ICML 2025] RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression ☆47 · Aug 7, 2025 · Updated 9 months ago
- Official Code for "A Likelihood Ratio-Based Approach to Segmenting Unknown Objects" [IJCV 2025] ☆15 · Jun 9, 2025 · Updated 11 months ago
- Relational Content-Based Image Retrieval (R-CBIR) - Retrieving images with given relationships among objects ☆17 · Oct 12, 2021 · Updated 4 years ago
- LiSu: A Dataset and Method for LiDAR Surface Normal Estimation ☆22 · Nov 30, 2025 · Updated 5 months ago
- Resources for our AAAI 2022 paper: "Unsupervised Editing for Counterfactual Stories". ☆12 · Oct 25, 2022 · Updated 3 years ago
- [ECCVW/TWYN 2024 - Best Workshop Paper] Are CLIP features all you need for Universal Synthetic Image Origin Attribution? ☆13 · Mar 27, 2026 · Updated last month
- code for FineLIP ☆40 · Nov 25, 2025 · Updated 5 months ago
- This repository contains the code for our CVPR 2024 paper, ☆15 · Aug 27, 2024 · Updated last year
- PyTorch implementation for DA-VPT (CVPR2025) ☆19 · Dec 15, 2025 · Updated 5 months ago
- A vision-language model with an improved cross-attention mechanism for scalable streaming inference ☆29 · Mar 9, 2026 · Updated 2 months ago
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer. ☆14 · Mar 2, 2024 · Updated 2 years ago
- ☆23 · May 18, 2025 · Updated last year
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts ☆17 · Apr 2, 2025 · Updated last year
- Official repository for ODQA experiments from Decomposed Prompting: A Modular Approach for Solving Complex Tasks, ICLR23 ☆12 · Jul 28, 2023 · Updated 2 years ago
- [COG24] - Official repository of "OfflineMania: A Benchmark Environment for Offline Reinforcement Learning in Racing Games" ☆12 · Jul 15, 2024 · Updated last year
- Learning to Count without Annotations ☆23 · May 24, 2024 · Updated 2 years ago
- [ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction ☆204 · Feb 5, 2024 · Updated 2 years ago
- Official implementation of the paper “Endowing Vision-Language Models with System 2 Thinking for Fine-Grained Visual Recognition,” AAAI 2… ☆38 · Jan 30, 2026 · Updated 3 months ago
- Multimodal RAG using LlamaIndex, Qdrant, llama.cpp for document QA with local Vision LLM and embedding models ☆18 · Nov 8, 2024 · Updated last year
- ☆17 · Oct 22, 2024 · Updated last year
- ☆13 · Apr 9, 2024 · Updated 2 years ago
- [AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning" ☆72 · Dec 8, 2025 · Updated 5 months ago
- ECCV2024: Adversarial Prompt Tuning for Vision-Language Models ☆31 · Mar 7, 2026 · Updated 2 months ago
- A simple Computer Vision Framework, mainly based on PyTorch. Including distributed training, logging and so on. ☆12 · Dec 2, 2023 · Updated 2 years ago
- ☆16 · Sep 6, 2024 · Updated last year
- This is the official repository for our paper "Doctor-R1: Mastering Clinical Inquiry with Experiential Agentic Reinforcement Learning" pu… ☆35 · Apr 11, 2026 · Updated last month
- Composed Video Retrieval ☆62 · May 2, 2024 · Updated 2 years ago
- [NeurIPS 2025] The official code for "IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation" ☆22 · Jun 5, 2025 · Updated 11 months ago
- [TCSVT] state-of-the-art open vocabulary detector on COCO/LVIS/V3Det ☆34 · Jun 3, 2025 · Updated 11 months ago
- Repository that contains simple scripts to use ROADWork dataset. ☆50 · Updated this week
- [ICLR 2025] - Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion ☆65 · Nov 30, 2025 · Updated 5 months ago
- [ICCV2023] Chaotic World: A Large and Challenging Benchmark for Human Behavior Understanding in Chaotic Events ☆10 · Dec 7, 2024 · Updated last year
- [CVPR2025] Official implementation of RAM ☆29 · Nov 4, 2025 · Updated 6 months ago
- Multiresolution Learning-based Hybrid Transformer-CNN Model for Anatomical Landmark Detection ☆13 · Nov 5, 2023 · Updated 2 years ago
- Official implementation of the paper "ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval" ☆28 · Dec 6, 2023 · Updated 2 years ago