360CVGroup / FG-CLIP

New generation of CLIP with fine-grained discrimination capability, ICML 2025

☆326

Alternatives and similar repositories for FG-CLIP

Users who are interested in FG-CLIP are comparing it to the repositories listed below.

nnnth / UFO
[NeurIPS2025 Spotlight 🔥 ] Official implementation of 🛸 "UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Langu…
☆225 Updated 3 weeks ago
congvvc / HyperSeg
[CVPR2025] Project for "HyperSeg: Towards Universal Visual Segmentation with Large Language Model".
☆170 Updated 10 months ago
PKU-ICST-MIPL / DyFo_CVPR2025
☆91 Updated 2 months ago
FoundationVision / GenerateU
[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection
☆182 Updated 6 months ago
dvlab-research / VisionReasoner
Vision Manus: Your versatile Visual AI assistant
☆284 Updated 2 weeks ago
dvlab-research / Seg-Zero
Project page for "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
☆531 Updated 2 months ago
PKU-ICST-MIPL / Finedefics_ICLR2025
☆73 Updated 6 months ago
Liuziyu77 / RAR
The official implementation of RAR
☆92 Updated last year
MaverickRen / PixelLM
[CVPR 2024] PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding.
☆238 Updated 8 months ago
zamling / PSALM
[ECCV2024] Official implementation of "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"
☆256 Updated 9 months ago
linyq2117 / TagCLIP
[AAAI 2024] TagCLIP: A Local-to-Global Framework to Enhance Open-Vocabulary Multi-Label Classification of CLIP Without Training
☆103 Updated last year
Fantasyele / LLaVA-KD
[ICCV 2025] Official implementation of LLaVA-KD: A Framework of Distilling Multimodal Large Language Models
☆103 Updated last week
mc-lan / Awesome-MLLM-Segmentation
A curated list of publications on image and video segmentation leveraging Multimodal Large Language Models (MLLMs), highlighting state-of…
☆139 Updated this week
geshang777 / Seg-R1
Official Implementation of "Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning"
☆48 Updated 3 months ago
Beckschen / ViTamin
[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"
☆210 Updated last year
wusize / CLIPSelf
[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
☆195 Updated last year
eternaldolphin / LaMI-DETR
[ECCV 2024] Official implementation of "LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction"
☆85 Updated 6 months ago
jefferyZhan / Griffon
Official repo of the Griffon series, including v1 (ECCV 2024), v2 (ICCV 2025), G, and R, as well as the RL tool Vision-R1.
☆239 Updated 2 months ago
xiaomoguhz / DeCLIP
[CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
☆133 Updated 4 months ago
baaivision / DIVA
[ICLR 2025] Diffusion Feedback Helps CLIP See Better
☆289 Updated 9 months ago
ant-research / DreamLIP
[ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions
☆136 Updated 5 months ago
x-cls / superclass
[NeurIPS 2024] Classification Done Right for Vision-Language Pre-Training
☆217 Updated 7 months ago
yayafengzi / LMM-HiMTok
HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model
☆68 Updated 3 months ago
PolyU-ChenLab / UniPixel
🔮 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025)
☆149 Updated this week
Christinepan881 / DINO-R1
☆49 Updated 3 months ago
linhuixiao / CLIP-VG
[TMM 2023] Self-paced Curriculum Adapting of CLIP for Visual Grounding.
☆129 Updated 2 months ago
om-ai-lab / GroundVLP
GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024)
☆72 Updated last year
shufangxun / LLaVA-MoD
[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
☆204 Updated 6 months ago
LeapLabTHU / GSVA
[CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models
☆149 Updated last year
zhengli97 / ATPrompt
[ICCV 2025] Official PyTorch Code for "Advancing Textual Prompt Learning with Anchored Attributes"
☆102 Updated last week