SamsungLabs / AdaCLIP
This repository contains the code for AdaCLIP, a computation- and latency-aware system for pragmatic multimodal video retrieval.
☆10Updated 11 months ago
Alternatives and similar repositories for AdaCLIP
Users who are interested in AdaCLIP are comparing it to the repositories listed below.
- [ECCV 2024 Oral] Official implementation of the paper "DEVIAS: Learning Disentangled Video Representations of Action and Scene"☆19Updated 7 months ago
- Official code for Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation☆26Updated 3 weeks ago
- [EMNLP 2024] Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality☆16Updated 7 months ago
- [ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model☆16Updated 3 months ago
- Official Repository of Personalized Visual Instruct Tuning☆28Updated 2 months ago
- ☆22Updated 6 months ago
- [ECCV 2024] R2-Bench: Benchmarking the Robustness of Referring Perception Models under Perturbations☆10Updated 9 months ago
- [CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training"☆35Updated last year
- ☆32Updated 3 months ago
- ☆54Updated last year
- Official Implementation of DiffCLIP: Differential Attention Meets CLIP☆26Updated 2 months ago
- [CVPR 2025] Few-shot Recognition via Stage-Wise Retrieval-Augmented Finetuning☆17Updated last month
- Official code for paper "Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models, ICML2024"☆24Updated 3 months ago
- [CVPR 2025] EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance☆15Updated last month
- ☆14Updated 7 months ago
- [NeurIPS 2023] Official Implementation of "PaintSeg: Painting Pixels for Training-free Segmentation"☆14Updated last year
- ☆43Updated 3 weeks ago
- [ICCV 2023] HybridAugment++: Unified Frequency Spectra Perturbations for Model Robustness☆17Updated last year
- ☆19Updated 6 months ago
- ☆12Updated 3 months ago
- [CVPR 2025] GPS as a Control Signal for Image Generation☆18Updated 2 months ago
- [CVPR 2024] Improving language-visual pretraining efficiency by performing cluster-based masking on images.☆28Updated last year
- [CVPR2025] Breaking the Low-Rank Dilemma of Linear Attention☆20Updated 2 months ago
- Croc: Pretraining Large Multimodal Models with Cross-Modal Comprehension☆23Updated 6 months ago
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"☆39Updated 5 months ago
- [ICML 2025] VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models☆24Updated 2 weeks ago
- [ICLR 2025] SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image and Video Generation☆38Updated 3 months ago
- ☆37Updated 10 months ago
- ☆11Updated 7 months ago
- [CVPR 2024] The official implementation of paper "Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding"☆43Updated 2 months ago