☆128Dec 26, 2025Updated 2 months ago
Alternatives and similar repositories for SuperCLIP
Users that are interested in SuperCLIP are comparing it to the libraries listed below.
- ☆60May 13, 2025Updated 10 months ago
- [AAAI 2026] Turbo-VAED: Fast and Stable Transfer of Video-VAEs to Mobile Devices☆95Nov 30, 2025Updated 3 months ago
- [ICCV 2025] GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding☆75Jun 26, 2025Updated 8 months ago
- [NeurIPS 2025] RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning☆192Nov 7, 2025Updated 4 months ago
- [AAAI'25 Oral] NightReID: A Large-Scale Nighttime Person Re-Identification Benchmark☆11Jun 10, 2025Updated 9 months ago
- [CVPR 2025] Official repository of the paper "Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation"☆125Oct 23, 2025Updated 5 months ago
- OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models☆143Apr 25, 2025Updated 10 months ago
- [NeurIPS 2025] VT-FSL: Bridging Vision and Text with LLMs for Few-Shot Learning☆32Dec 9, 2025Updated 3 months ago
- [CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception☆153Jan 10, 2026Updated 2 months ago
- A code base for the official XS-VID dataset baseline method YOLOFT☆19Dec 24, 2024Updated last year
- The first decoder-only multimodal state space model☆100May 19, 2025Updated 10 months ago
- A project that gathers state-of-the-art knowledge distillation approaches for unsupervised anomaly detection☆14Oct 10, 2025Updated 5 months ago
- Featurized Query R-CNN☆45Jun 17, 2022Updated 3 years ago
- ☆17Nov 17, 2023Updated 2 years ago
- [NeurIPS 2024] Classification Done Right for Vision-Language Pre-Training☆225Mar 20, 2025Updated last year
- [CVPR 2025] GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding☆209Jan 5, 2026Updated 2 months ago
- Close, But Not There: Boosting Geographic Distance Sensitivity in Visual Place Recognition☆42Dec 5, 2024Updated last year
- [Findings of ACL-2023] This is the official implementation of On the Difference of BERT-style and CLIP-style Text Encoders.☆14Jun 7, 2023Updated 2 years ago
- [AAAI 2025] Linear-complexity Visual Sequence Learning with Gated Linear Attention☆116Jun 17, 2024Updated last year
- Visual Generation Tuning☆99Jan 27, 2026Updated last month
- (CVPR 2026) Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation☆30Feb 28, 2026Updated 3 weeks ago
- This repository contains the **official implementation** of the paper: "VL2Lite: Task-Specific Knowledge Distillation from Large Vision-…☆16Mar 23, 2025Updated last year
- [ACL 2025] ⚖️ Temporally-aware MLLM for Biomedical Radiology Analysis and Report Generation. Flexible toolkit with MLLM backbone support,…☆28Updated this week
- ☆22Nov 27, 2025Updated 3 months ago
- [ACM MM 2024] WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition☆58Apr 8, 2025Updated 11 months ago
- Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing☆37Apr 6, 2025Updated 11 months ago
- ☆12Aug 10, 2022Updated 3 years ago
- A lightweight Text-to-Image Retrieval model [Web App]☆29Dec 6, 2024Updated last year
- ☆14Jul 1, 2025Updated 8 months ago
- [ACL 2025] RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection☆34Jul 23, 2025Updated 8 months ago
- [CVPR 2025] DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention☆179Mar 1, 2025Updated last year
- ☆14Dec 11, 2024Updated last year
- [IJCV 2024]☆21Nov 11, 2024Updated last year
- [CVPR 2025] Official implementation of the paper "Skip Tuning: Pre-trained Vision-Language Models are Effective and Efficient Adapters The…☆31Mar 12, 2026Updated last week
- ☆10Dec 16, 2023Updated 2 years ago
- [ECCV 2024🔥] The official code for the paper DiffFAS: Face Anti-Spoofing via Generative Diffusion Models.☆42Sep 23, 2024Updated last year
- [EMNLP25 Main] The official code of "Gradient-Attention Guided Dual-Masking Synergetic Framework for Robust Text-based Person Retrieval"☆22Mar 11, 2026Updated last week
- VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning☆326Feb 9, 2026Updated last month
- Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning☆321Mar 26, 2025Updated 11 months ago