w1oves / hqclip
[ICCV 2025] HQ-CLIP: Leveraging Large Vision-Language Models to Create High-Quality Image-Text Datasets
☆45 Updated this week
Alternatives and similar repositories for hqclip
Users who are interested in hqclip are comparing it to the repositories listed below.
- Code and dataset link for "DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World" ☆96 Updated last month
- SimCMF: A Simple Cross-modal Fine-tuning Strategy from Vision Foundation Models to Any Imaging Modality ☆33 Updated 8 months ago
- [CVPR 2025] Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training ☆74 Updated 3 weeks ago
- [CVPR2025] Official code repository for SeTa: "Scale Efficient Training for Large Datasets" ☆19 Updated 4 months ago
- DiverGen (CVPR 2024) & BSGAL (ICML 2024) ☆49 Updated last month
- [ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference ☆86 Updated 4 months ago
- [CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models ☆45 Updated last month
- ☆30 Updated last year
- ☆78 Updated 2 months ago
- ☆45 Updated 7 months ago
- ☆65 Updated this week
- [CVPR2025] Breaking the Low-Rank Dilemma of Linear Attention ☆25 Updated 4 months ago
- [ICCV 2025] GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding ☆66 Updated last month
- [NeurIPS 2024] Official code release for our paper "Revisiting the Integration of Convolution and Attention for Vision Backbone" ☆40 Updated 6 months ago
- [CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception ☆70 Updated 2 months ago
- This repo contains the code for our paper "Towards Open-Ended Visual Recognition with Large Language Model" ☆98 Updated last year
- [BMVC 2024] PlainMamba: Improving Non-hierarchical Mamba in Visual Recognition ☆79 Updated 4 months ago
- ☆112 Updated last year
- [NeurIPS 2023] FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models ☆131 Updated last year
- Official repository of InLine attention (NeurIPS 2024) ☆52 Updated 7 months ago
- [ICCV 2025] Official implementation of LLaVA-KD: A Framework of Distilling Multimodal Large Language Models ☆91 Updated last month
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling ☆38 Updated 5 months ago
- ☆18 Updated last year
- The official implementation of "Neighboring Autoregressive Modeling for Efficient Visual Generation" ☆53 Updated 4 months ago
- Open-Vocabulary Panoptic Segmentation ☆26 Updated last month
- Simple script to parallelize download and extract files for SA-1B Dataset. ☆37 Updated last month
- [CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient ☆105 Updated 4 months ago
- [ICCV2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation ☆151 Updated 2 months ago
- ☆145 Updated last month
- EraseAnything, ICML 2025 ☆24 Updated 2 months ago