SunzeY / AlphaCLIP
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
☆831 · Updated last month
Alternatives and similar repositories for AlphaCLIP
Users interested in AlphaCLIP are comparing it to the repositories listed below.
- [CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses tha… ☆893 · Updated last month
- [ECCV 2024] Official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP" ☆826 · Updated 11 months ago
- [Pattern Recognition 25] CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks ☆430 · Updated 4 months ago
- [CVPR 2024] Official implementation of the paper "Visual In-context Learning" ☆481 · Updated last year
- [ECCV 2024] Tokenize Anything via Prompting ☆585 · Updated 7 months ago
- Project page for "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement" ☆452 · Updated last month
- Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight] ☆914 · Updated last year
- [CVPR 24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloadin… ☆226 · Updated 9 months ago
- VisionLLM Series ☆1,084 · Updated 4 months ago
- [ICLR 2025] Diffusion Feedback Helps CLIP See Better ☆283 · Updated 5 months ago
- Recent LLM-based CV and related works. Welcome to comment/contribute! ☆869 · Updated 4 months ago
- [ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of … ☆488 · Updated 11 months ago
- VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks ☆385 · Updated last year
- LLM2CLIP makes SOTA pretrained CLIP models even more SOTA. ☆531 · Updated last week
- [CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining" ☆774 · Updated last year
- [CVPR 2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts ☆325 · Updated 11 months ago
- PyTorch implementation that adds new features to Segment-Anything; the features support batch input on the fu… ☆155 · Updated last year
- [ICCV 2023] Official implementation of the paper "A Simple Framework for Open-Vocabulary Segmentation and Detection" ☆719 · Updated last year
- ☆536 · Updated 2 years ago
- ☆628 · Updated last year
- Official open-source code for "Scaling Language-Image Pre-training via Masking" ☆426 · Updated 2 years ago
- [NeurIPS 2023] DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models ☆321 · Updated last year
- PyTorch implementation of RCG https://arxiv.org/abs/2312.03701 ☆917 · Updated 9 months ago
- Experiment on combining CLIP with SAM to do open-vocabulary image segmentation. ☆374 · Updated 2 years ago
- Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series ☆979 · Updated 5 months ago
- Official PyTorch implementation of the paper "Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP" ☆727 · Updated last year
- LaVIT: Empower the Large Language Model to Understand and Generate Visual Content ☆583 · Updated 9 months ago
- [ECCV 2024] VideoMamba: State Space Model for Efficient Video Understanding ☆977 · Updated last year
- [NeurIPS 2024] A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing ☆554 · Updated 8 months ago
- Third-party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detectio… ☆638 · Updated last year