[CBMI 2024 Best Paper] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?".
☆32 · May 12, 2025 · Updated last year
Alternatives and similar repositories for FG-CLIP
Users that are interested in FG-CLIP are comparing it to the libraries listed below.
- [CVPR 2024 Highlight] Official repository of the paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detec… ☆67 · Apr 4, 2025 · Updated last year
- [WACV 2026] Official implementation of the paper: “CountingDINO: A Training-free Pipeline for Exemplar-based Class-Agnostic Counting” ☆61 · Mar 8, 2026 · Updated 3 months ago
- [ICML2024] Official PyTorch implementation of CoMC: Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition ☆17 · Jul 9, 2024 · Updated last year
- [ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset ☆87 · Aug 6, 2025 · Updated 10 months ago
- A Massive Multi-Discipline Lecture Understanding Benchmark ☆34 · Apr 20, 2026 · Updated last month
- [ICML 2025] RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression ☆47 · Aug 7, 2025 · Updated 10 months ago
- Official Code for "A Likelihood Ratio-Based Approach to Segmenting Unknown Objects" [IJCV 2025] ☆15 · Jun 9, 2025 · Updated last year
- [CVPR 2026] OccAny: Generalized Unconstrained Urban 3D Occupancy. The first Unified Framework for Generalized 3D Occupancy Prediction. Su… ☆118 · Apr 1, 2026 · Updated 2 months ago
- This repository is the official implementation of our paper Robust Diffusion Model-Generated Image Detection with CLIP, accepted by MIPR … ☆11 · Jun 13, 2024 · Updated 2 years ago
- We introduce R2S100K, a large-scale dataset and benchmark for training and evaluation of road segmentation in challenging unstructured r… ☆18 · Jan 28, 2026 · Updated 4 months ago
- LiSu: A Dataset and Method for LiDAR Surface Normal Estimation ☆23 · Nov 30, 2025 · Updated 6 months ago
- This repository contains the code for our CVPR 2024 paper, ☆15 · Aug 27, 2024 · Updated last year
- An Evaluation Framework for Temporal Information Extraction Systems ☆21 · Feb 19, 2026 · Updated 3 months ago
- A large scale dataset for Video Captioning in Italian ☆13 · May 16, 2023 · Updated 3 years ago
- Pytorch implementation for DA-VPT (CVPR2025) ☆19 · Dec 15, 2025 · Updated 5 months ago
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer. ☆14 · Mar 2, 2024 · Updated 2 years ago
- ☆23 · May 18, 2025 · Updated last year
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts ☆18 · Apr 2, 2025 · Updated last year
- VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model ☆15 · Jul 31, 2025 · Updated 10 months ago
- [COG24] - Official repository of "OfflineMania: A Benchmark Environment for Offline Reinforcement Learning in Racing Games" ☆12 · Jul 15, 2024 · Updated last year
- [ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction ☆207 · Feb 5, 2024 · Updated 2 years ago
- Official implementation of the paper “Endowing Vision-Language Models with System 2 Thinking for Fine-Grained Visual Recognition,” AAAI 2… ☆41 · Jan 30, 2026 · Updated 4 months ago
- Multimodal RAG using LlamaIndex, Qdrant, llama.cpp for document QA with local Vision LLM and embedding models ☆20 · Nov 8, 2024 · Updated last year
- ☆17 · Oct 22, 2024 · Updated last year
- [AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning" ☆74 · Dec 8, 2025 · Updated 6 months ago
- A simple Computer Vision Framework, mainly based on PyTorch. Including distributed training, logging and so on. ☆12 · Dec 2, 2023 · Updated 2 years ago
- ECCV2024: Adversarial Prompt Tuning for Vision-Language Models ☆31 · Mar 7, 2026 · Updated 3 months ago
- A simple chatbot in Italian using TensorFlow ☆14 · Mar 4, 2019 · Updated 7 years ago
- [MICCAI 2023] (early accept) UOD: universal oneshot detection of anatomical landmarks. https://arxiv.org/abs/2306.07615 ☆12 · Jan 4, 2024 · Updated 2 years ago
- This is the official repository for our paper "Doctor-R1: Mastering Clinical Inquiry with Experiential Agentic Reinforcement Learning" pu… ☆38 · Apr 11, 2026 · Updated 2 months ago
- Composed Video Retrieval ☆62 · May 2, 2024 · Updated 2 years ago
- Examples of Verbalized Machine Learning (VML) ☆16 · Mar 16, 2025 · Updated last year
- Pothole Detection using Ultralytics YOLOv8. ☆34 · Sep 30, 2024 · Updated last year
- [ICCV2023] Chaotic World: A Large and Challenging Benchmark for Human Behavior Understanding in Chaotic Events ☆10 · Dec 7, 2024 · Updated last year
- A PyTorch implementation of NormSoftmax based on BMVC 2019 paper "Classification is a Strong Baseline for Deep Metric Learning" ☆10 · Mar 15, 2020 · Updated 6 years ago
- Official implementation of the paper "ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval" ☆28 · Dec 6, 2023 · Updated 2 years ago
- [ECCV2024]FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance ☆18 · Sep 11, 2024 · Updated last year
- Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024) ☆147 · Jan 5, 2026 · Updated 5 months ago
- Official Implementation of Attentive Mask CLIP (ICCV2023, https://arxiv.org/abs/2212.08653) ☆36 · May 29, 2024 · Updated 2 years ago