Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning
☆20Dec 21, 2023Updated 2 years ago
Alternatives and similar repositories for SCL
Users that are interested in SCL are comparing it to the libraries listed below.
- The code of "Image-text Retrieval via Preserving Main Semantic of Vision" in ICME 2023.☆15Dec 25, 2023Updated 2 years ago
- ☆73Apr 21, 2026Updated last month
- Official code for the CVPR 2024 Paper "Can Biases in ImageNet Models Explain Generalization?".☆13Jun 24, 2024Updated last year
- [NeurIPS 2025] The official implementation of the paper "DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agen…☆51Apr 19, 2026Updated last month
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"☆12Feb 27, 2024Updated 2 years ago
- ☆45Aug 14, 2023Updated 2 years ago
- A simple PyTorch implementation of a CLIP-based baseline for image-text matching.☆19May 25, 2023Updated 3 years ago
- ☆10Jan 9, 2025Updated last year
- Character Grounding and Re-Identification in Story of Videos and Text Descriptions☆10Jan 17, 2021Updated 5 years ago
- Research code for "Training Vision-Language Transformers from Captions Alone"☆33Jul 15, 2022Updated 3 years ago
- The PyTorch implementation for "Video-Text Pre-training with Learned Regions"☆43Jul 15, 2022Updated 3 years ago
- ☆23Nov 26, 2024Updated last year
- Code for the paper "A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval" accepted b…☆19Jan 16, 2024Updated 2 years ago
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training☆141Dec 16, 2025Updated 5 months ago
- Code for WACV 2024 paper ✨ "SpectralCLIP: Preventing Artifacts in Text-Guided Style Transfer from a Spectral Perspective".☆19Nov 4, 2023Updated 2 years ago
- Implementation for the paper "Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly" (ECCV 2022: https://arxiv.org/abs…☆40May 19, 2023Updated 3 years ago
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024☆18Oct 11, 2024Updated last year
- ☆11Feb 14, 2023Updated 3 years ago
- [CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs☆142Apr 27, 2026Updated last month
- [ICCV 2025] Prompt-A-Video☆24Feb 2, 2025Updated last year
- [ICLR 2025] IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model☆37Nov 27, 2024Updated last year
- Unofficial implementation of CVPR2021 paper "Perceptual Image Quality Assessment with Transformers"☆76Oct 21, 2021Updated 4 years ago
- ☆22Mar 7, 2025Updated last year
- ☆12Sep 6, 2023Updated 2 years ago
- [Official] PIPAL Dataset and Training Codebase. ECCV-2020, NTIRE-21/22.☆79Jan 3, 2022Updated 4 years ago
- A Range-Null Space Decomposition Approach for Fast and Flexible Spectral Compressive Imaging☆11May 18, 2023Updated 3 years ago
- The code of the paper "Negative Pre-aware for Noisy Cross-modal Matching" in AAAI 2024.☆31Jul 2, 2025Updated 11 months ago
- DropKAN (Dropout Kolmogorov Arnold Networks)☆18Jun 23, 2025Updated 11 months ago
- The code of MGCC: Text-based Occluded Person Re-identification via Multi-Granularity Contrastive Consistency Learning☆20Feb 26, 2025Updated last year
- ☆12Aug 14, 2019Updated 6 years ago
- ☆20Feb 3, 2025Updated last year
- [NeurIPS 2023] The official implementation of the paper "Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval" acce…☆28May 14, 2024Updated 2 years ago
- [CVPR 2021] PyTorch implementation for Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation☆19May 7, 2021Updated 5 years ago
- Visual self-questioning for large vision-language assistant.☆44Jul 23, 2025Updated 10 months ago
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".☆18Sep 17, 2021Updated 4 years ago
- [CVPR 2023] Egocentric Audio-Visual Object Localization☆27Jan 6, 2024Updated 2 years ago
- Codebase of 'From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model'☆45May 10, 2026Updated last month
- MultiPriv offers multilingual, multimodal PII entities and prompts for studying privacy risks in LLMs/VLMs. It also supports broader PII-…☆31Dec 10, 2025Updated 6 months ago
- Code for RFSR: Improving ISR Diffusion Models via Reward Feedback Learning☆18Dec 8, 2024Updated last year