hustvl/SuperCLIP

hustvl / SuperCLIP

☆140

Alternatives and similar repositories for SuperCLIP

Users who are interested in SuperCLIP are comparing it to the libraries listed below.

hustvl / TBCM
Image-Free Timestep Distillation via Continuous-Time Consistency with Trajectory-Sampled Pairs
☆21 · Dec 16, 2025 · Updated 7 months ago
hustvl / InfiniteVL
InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
☆110 · Jul 7, 2026 · Updated 2 weeks ago
hustvl / MobileI2V
[ArXiv 2025] MobileI2V: Fast and High-Resolution Image-to-Video on Mobile Devices
☆87 · May 20, 2026 · Updated 2 months ago
hustvl / Spa3R
Spa3R: Predictive Spatial Field Modeling for 3D Visual Reasoning
☆51 · Mar 25, 2026 · Updated 3 months ago
hustvl / VGT
Visual Generation Tuning
☆101 · Apr 16, 2026 · Updated 3 months ago
hustvl / ViTGaze
Official code of "ViTGaze: Gaze Following with Interaction Features in Vision Transformers"
☆62 · Mar 3, 2025 · Updated last year
hustvl / MaTVLM
☆62 · May 13, 2025 · Updated last year
hustvl / Turbo-VAED
[AAAI 2026] Turbo-VAED: Fast and Stable Transfer of Video-VAEs to Mobile Devices
☆131 · Jul 10, 2026 · Updated last week
hustvl / OpenInst
☆17 · Nov 17, 2023 · Updated 2 years ago
hustvl / mmMamba
The first decoder-only multimodal state space model
☆104 · May 19, 2025 · Updated last year
hustvl / Snap-Snap
The repository of "Snap-Snap: Taking Two Images to Reconstruct 3D Human Gaussians in Milliseconds"
☆40 · Sep 1, 2025 · Updated 10 months ago
hustvl / ViG
[AAAI 2025] Linear-complexity Visual Sequence Learning with Gated Linear Attention
☆116 · Jun 17, 2024 · Updated 2 years ago
ZrH42 / UniX
☆31 · Mar 29, 2026 · Updated 3 months ago
hustvl / GaussTR
[CVPR 2025] GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding
☆217 · Jan 5, 2026 · Updated 6 months ago
hustvl / RAD
[NeurIPS 2025] RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
☆265 · Apr 17, 2026 · Updated 3 months ago
hustvl / EVA-X
[Nature Portfolio, npj DigitalMed] EVA-X: A foundation model for general chest X-ray analysis with self-supervised learning
☆100 · Jun 12, 2026 · Updated last month
x-cls / superclass
[NeurIPS 2024] Classification Done Right for Vision-Language Pre-Training
☆223 · Mar 20, 2025 · Updated last year
YanFangCS / GenLIP
Official repo for "Let ViT Speak: Generative Language-Image Pre-training"
☆133 · Jun 10, 2026 · Updated last month
google-deepmind / tips
TIPSv2 (CVPR'26) and TIPS (ICLR'25)
☆574 · Jun 1, 2026 · Updated last month
hustvl / OmniMamba
[ECCV 2026] OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models
☆126 · Apr 25, 2025 · Updated last year
MiniMax-AI / VTP
[ECCV 2026] Towards Scalable Pre-training of Visual Tokenizers for Generation
☆495 · Apr 15, 2026 · Updated 3 months ago
XLearning-SCU / 2026-AAAI-SCAN
Official implementation of the paper "Endowing Vision-Language Models with System 2 Thinking for Fine-Grained Visual Recognition," AAAI 2…
☆44 · Jan 30, 2026 · Updated 5 months ago
VisionXLab / ProCLIP
Official PyTorch implementation of ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder
☆25 · Dec 4, 2025 · Updated 7 months ago
GaryGuTC / UniME-v2
[AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning"
☆74 · Dec 8, 2025 · Updated 7 months ago
hustvl / DiG
[CVPR 2025] DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
☆184 · Mar 1, 2025 · Updated last year
peterant330 / KUEA
[ICML'25] Kernel-based Unsupervised Embedding Alignment for Enhanced Visual Representation in Vision-language Models
☆23 · Sep 7, 2025 · Updated 10 months ago
hustvl / LENS
[AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning
☆136 · Dec 3, 2025 · Updated 7 months ago
xiaomoguhz / DeCLIP
[CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
☆169 · Jan 10, 2026 · Updated 6 months ago
ilias-vrg / ilias
ILIAS: Instance-Level Image retrieval At Scale
☆38 · Mar 13, 2026 · Updated 4 months ago
gjhhust / YOLOFT
A code base for the official XS-VID dataset baseline method YOLOFT
☆22 · Dec 24, 2024 · Updated last year
stone96123 / MDReID
[NeurIPS 2025] MDReID: Modality-Decoupled Learning for Any-to-Any Multi-Modal Object Re-Identification
☆16 · Jul 3, 2026 · Updated 2 weeks ago
m2diffuser / M2Diffuser
Official implementation of T-PAMI25 paper "M²Diffuser: Diffusion-based Trajectory Optimization for Mobile Manipulation in 3D Scenes"
☆122 · Jun 17, 2025 · Updated last year
cwangrun / CheXficient
CheXficient
☆15 · Jun 28, 2026 · Updated 3 weeks ago
hustvl / CircuitFormer
[NeurIPS 2023] CircuitFormer: Circuit as Set of Points
☆38 · Nov 22, 2023 · Updated 2 years ago
Mid-Push / SmartCLIP
SmartCLIP: A training method to improve CLIP with both short and long texts
☆43 · Jun 18, 2025 · Updated last year
msm8976 / NightReID
[AAAI'25 Oral] NightReID: A Large-Scale Nighttime Person Re-Identification Benchmark
☆11 · Jun 10, 2025 · Updated last year
hustvl / Featurized-QueryRCNN
Featurized Query R-CNN
☆46 · Jun 17, 2022 · Updated 4 years ago
serizba / cliquemining
Close, But Not There: Boosting Geographic Distance Sensitivity in Visual Place Recognition
☆42 · Dec 5, 2024 · Updated last year
csiro-robotics / Pair-VPR
[IEEE RA-L 2025] The official repository for Pair-VPR: Place-Aware Pre-training and Contrastive Pair Classification for Visual Place Reco…
☆63 · May 22, 2026 · Updated 2 months ago