[CBMI 2024 Best Paper] Official repository of the paper "Is CLIP the main roadblock for fine-grained open-world perception?".
☆32May 12, 2025Updated 11 months ago
Alternatives and similar repositories for FG-CLIP
Users that are interested in FG-CLIP are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [CVPR 2024 Highlight] Official repository of the paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detec…☆67Apr 4, 2025Updated last year
- [ICML2024] Official PyTorch implementation of CoMC: Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition☆17Jul 9, 2024Updated last year
- [ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset☆87Aug 6, 2025Updated 8 months ago
- [ICCV 2025] Official repository of the paper "Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabular…☆183Nov 10, 2025Updated 5 months ago
- A Massive Multi-Discipline Lecture Understanding Benchmark☆34Apr 20, 2026Updated 2 weeks ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- Official Code for "A Likelihood Ratio-Based Approach to Segmenting Unknown Objects" [IJCV 2025]☆15Jun 9, 2025Updated 10 months ago
- [ICML 2025] RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression☆45Aug 7, 2025Updated 8 months ago
- This repository is the official implementation of our paper Robust Diffusion Model-Generated Image Detection with CLIP, accepted by MIPR …☆11Jun 13, 2024Updated last year
- LiSu: A Dataset and Method for LiDAR Surface Normal Estimation☆22Nov 30, 2025Updated 5 months ago
- [ECCVW/TWYN 2024 - Best Workshop Paper] Are CLIP features all you need for Universal Synthetic Image Origin Attribution?☆13Mar 27, 2026Updated last month
- code for FineLIP☆40Nov 25, 2025Updated 5 months ago
- Repository for evaluating Pegasus-1 and video-language foundation models☆14Nov 12, 2024Updated last year
- This repository contains the code for our CVPR 2024 paper,☆15Aug 27, 2024Updated last year
- An Evaluation Framework for Temporal Information Extraction Systems☆20Feb 19, 2026Updated 2 months ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- A large scale dataset for Video Captioning in Italian☆13May 16, 2023Updated 2 years ago
- ☆25Jan 19, 2026Updated 3 months ago
- Pytorch implementation for DA-VPT (CVPR2025)☆18Dec 15, 2025Updated 4 months ago
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer.☆14Mar 2, 2024Updated 2 years ago
- ☆23May 18, 2025Updated 11 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆17Apr 2, 2025Updated last year
- VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model☆15Jul 31, 2025Updated 9 months ago
- Container Specification, Tutorials and Examples for the AI4EU Experiments docker/grpc format for models☆13Jul 10, 2022Updated 3 years ago
- Reproduced the DFT method without using Verl. https://arxiv.org/abs/2508.05629☆23Oct 14, 2025Updated 6 months ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- [COG24] - Official repository of "OfflineMania: A Benchmark Environment for Offline Reinforcement Learning in Racing Games"☆12Jul 15, 2024Updated last year
- Learning to Count without Annotations☆23May 24, 2024Updated last year
- [ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction☆204Feb 5, 2024Updated 2 years ago
- Official implementation of the paper “Endowing Vision-Language Models with System 2 Thinking for Fine-Grained Visual Recognition,” AAAI 2…☆37Jan 30, 2026Updated 3 months ago
- Multimodal RAG using LlamaIndex, Qdrant, llama.cpp for document QA with local VisonLLM and embedding models☆18Nov 8, 2024Updated last year
- ☆17Oct 22, 2024Updated last year
- ☆13Apr 9, 2024Updated 2 years ago
- ECCV2024: Adversarial Prompt Tuning for Vision-Language Models☆31Mar 7, 2026Updated last month
- A simple Computer Vision Framework, mainly based on PyTorch. Including distributed training, logging and so on.☆12Dec 2, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Un semplice Chatbot in italiano usando Tensorflow☆14Mar 4, 2019Updated 7 years ago
- ☆16Sep 6, 2024Updated last year
- Composed Video Retrieval☆62May 2, 2024Updated 2 years ago
- Examples of Verbalized Machine Learning (VML)☆16Mar 16, 2025Updated last year
- [TCSVT] state-of-the-art open vocabulary detector on COCO/LVIS/V3Det☆34Jun 3, 2025Updated 11 months ago
- [ICLR 2025] - Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion☆65Nov 30, 2025Updated 5 months ago
- [ICCV2023] Chaotic World: A Large and Challenging Benchmark for Human Behavior Understanding in Chaotic Events☆10Dec 7, 2024Updated last year