SamsungLabs / AdaCLIP
This repository contains the code for AdaCLIP, a computation- and latency-aware system for pragmatic multimodal video retrieval.
☆10 · Updated 7 months ago
Alternatives and similar repositories for AdaCLIP:
Users who are interested in AdaCLIP are comparing it to the repositories listed below.
- [CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training" ☆32 · Updated 8 months ago
- [ICCV 2023] HybridAugment++: Unified Frequency Spectra Perturbations for Model Robustness ☆17 · Updated last year
- Official Repository of Personalized Visual Instruct Tuning ☆26 · Updated 2 months ago
- PiTe: Pixel-Temporal Alignment for Large Video-Language Model ☆15 · Updated this week
- SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image and Video Generation (arXiv: 2410.12761) ☆19 · Updated 2 months ago
- [ECCV 2024] Official repository for "DataDream: Few-shot Guided Dataset Generation" ☆26 · Updated 5 months ago
- [CVPR 2023] Pytorch Code of MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering ☆16 · Updated last year
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training" ☆29 · Updated 10 months ago
- Official implementation of the CVPR'24 paper "Adaptive Slot Attention: Object Discovery with Dynamic Slot Number" ☆31 · Updated last month
- Unifying Specialized Visual Encoders for Video Language Models ☆13 · Updated this week
- Data-Efficient Multimodal Fusion on a Single GPU ☆51 · Updated 8 months ago
- MuCR is a benchmark designed to evaluate Vision Large Language Models' (VLLMs) ability to infer causal relationships using only visual cu… ☆14 · Updated 4 months ago
- Disentangled Pre-training for Human-Object Interaction Detection ☆18 · Updated 2 months ago
- [CVPR2024 Highlight] Official implementation for Transferable Visual Prompting. The paper "Exploring the Transferability of Visual Prompt… ☆34 · Updated 3 weeks ago
- Official Implementation (Pytorch) of "EfficientViM: Efficient Vision Mamba with Hidden State Mixer-based State Space Duality" ☆16 · Updated last month
- [CVPR 2024] Improving language-visual pretraining efficiency by performing cluster-based masking on images. ☆25 · Updated 7 months ago
- Multimodal Video Understanding Framework (MVU) ☆26 · Updated 7 months ago
- Code for "AVG-LLaVA: A Multimodal Large Model with Adaptive Visual Granularity" ☆19 · Updated 2 months ago
- Official implementation of CVPR 2024 paper "Prompt Learning via Meta-Regularization". ☆25 · Updated 4 months ago
- [CVPR 2024 Highlight] ImageNet-D ☆40 · Updated 2 months ago
- [ICLR 2024] Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models. ☆65 · Updated 5 months ago
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)" ☆30 · Updated 6 months ago
- Retrieval-Augmented Personalization ☆11 · Updated last month
- [CVPR 2024] Code and models for pi-ViT, a video transformer for understanding activities of daily living ☆17 · Updated 4 months ago
- Official Implementation (Pytorch) of "DDMI: Domain-Agnostic Latent Diffusion Models for Synthesizing High-Quality Implicit Neural Represe… ☆24 · Updated 6 months ago