Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning
☆20Dec 21, 2023Updated 2 years ago
Alternatives and similar repositories for SCL
Users that are interested in SCL are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- The code of "Image-text Retrieval via Preserving Main Semantic of Vision" in ICME 2023.☆15Dec 25, 2023Updated 2 years ago
- ☆72Apr 21, 2026Updated last month
- [NeurIPS 2025] The official implementation of the paper "DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agen…☆49Apr 19, 2026Updated last month
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"☆12Feb 27, 2024Updated 2 years ago
- [ACL 2023] Code and data for our paper "Measuring Progress in Fine-grained Vision-and-Language Understanding"☆13Jun 11, 2023Updated 2 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- ☆45Aug 14, 2023Updated 2 years ago
- A simple pytorch implementation of baseline based-on CLIP for Image-text Matching.☆19May 25, 2023Updated 2 years ago
- ☆10Jan 9, 2025Updated last year
- Character Grounding and Re-Identification in Story of Videos and Text Descriptions☆10Jan 17, 2021Updated 5 years ago
- Research code for "Training Vision-Language Transformers from Captions Alone"☆33Jul 15, 2022Updated 3 years ago
- ☆22Nov 26, 2024Updated last year
- The code of the paper of "A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval" accepted b…☆19Jan 16, 2024Updated 2 years ago
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training☆141Dec 16, 2025Updated 5 months ago
- Implementation for the paper "Reliable Visual Question Answering Abstain Rather Than Answer Incorrectly" (ECCV 2022: https//arxiv.org/abs…☆40May 19, 2023Updated 3 years ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- RSTPReid Dataset for Text-based Person Retrieval.☆32Sep 2, 2022Updated 3 years ago
- [AAAI 2026] Segment Anything Across Shots: A Method and Benchmark☆30Nov 16, 2025Updated 6 months ago
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024☆18Oct 11, 2024Updated last year
- ☆12Feb 14, 2023Updated 3 years ago
- [CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs☆132Apr 27, 2026Updated 3 weeks ago
- [ICCV 2025] Prompt-A-Video☆23Feb 2, 2025Updated last year
- [ICLR 2025] IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model☆37Nov 27, 2024Updated last year
- Unofficial implementation of CVPR2021 paper "Perceptual Image Quality Assessment with Transformers"☆76Oct 21, 2021Updated 4 years ago
- ☆22Mar 7, 2025Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- [ Official ] - PIPAL Dataset and Training Codebase. ECCV-2020, NTIRE-21/22.☆79Jan 3, 2022Updated 4 years ago
- A Range-Null Space Decomposition Approach for Fast and Flexible Spectral Compressive Imaging☆11May 18, 2023Updated 3 years ago
- The code of the paper "Negative Pre-aware for Noisy Cross-modal Matching" in AAAI 2024.☆31Jul 2, 2025Updated 10 months ago
- DropKAN (Dropout Kolmogorov Arnold Networks)☆18Jun 23, 2025Updated 10 months ago
- ☆12Aug 14, 2019Updated 6 years ago
- ☆20Feb 3, 2025Updated last year
- [NeurIPS 2023] The official implementation of paper "Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval" acce…☆28May 14, 2024Updated 2 years ago
- attention으로 시계열 예측은 할 수 없을까☆10Apr 30, 2021Updated 5 years ago
- [CVPR 2021] Pytorch implementation for Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation☆19May 7, 2021Updated 5 years ago
- End-to-end encrypted email - Proton Mail • AdSpecial offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
- Visual self-questioning for large vision-language assistant.☆44Jul 23, 2025Updated 9 months ago
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".☆18Sep 17, 2021Updated 4 years ago
- A Survey of Dataset Refinement for Problems in Computer Vision Datasets☆34Sep 12, 2025Updated 8 months ago
- [CVPR 2023] Egocentric Audio-Visual Object Localization☆26Jan 6, 2024Updated 2 years ago
- Codebase of 'From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model'☆45May 10, 2026Updated last week
- MultiPriv offers multilingual, multimodal PII entities and prompts for studying privacy risks in LLMs/VLMs. It also supports broader PII-…☆30Dec 10, 2025Updated 5 months ago
- FedCMR: Federated Cross-Modal Retrieval 的代码(the official implementation of FedCMR: Federated Cross-Modal Retrieval)☆17Oct 17, 2025Updated 7 months ago