Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning
☆20Dec 21, 2023Updated 2 years ago
Alternatives and similar repositories for SCL
Users that are interested in SCL are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- The code of "Image-text Retrieval via Preserving Main Semantic of Vision" in ICME 2023.☆15Dec 25, 2023Updated 2 years ago
- ☆74Apr 21, 2026Updated 2 months ago
- Official code for the CVPR 2024 Paper "Can Biases in ImageNet Models Explain Generalization?".☆13Jun 24, 2024Updated 2 years ago
- [NeurIPS 2025] The official implementation of the paper "DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agen…☆55Apr 19, 2026Updated 2 months ago
- [CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"☆12Feb 27, 2024Updated 2 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- [ACL 2023] Code and data for our paper "Measuring Progress in Fine-grained Vision-and-Language Understanding"☆13Jun 11, 2023Updated 3 years ago
- ☆45Aug 14, 2023Updated 2 years ago
- [IJCAI 2023] Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment☆53Apr 9, 2024Updated 2 years ago
- A simple pytorch implementation of baseline based-on CLIP for Image-text Matching.☆19May 25, 2023Updated 3 years ago
- ☆10Jan 9, 2025Updated last year
- Character Grounding and Re-Identification in Story of Videos and Text Descriptions☆10Jan 17, 2021Updated 5 years ago
- Research code for "Training Vision-Language Transformers from Captions Alone"☆33Jul 15, 2022Updated 3 years ago
- The Pytorch implementation for "Video-Text Pre-training with Learned Regions"☆43Jul 15, 2022Updated 3 years ago
- ☆23Nov 26, 2024Updated last year
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training☆141Dec 16, 2025Updated 6 months ago
- Code for WACV 2024 paper ✨ "SpectralCLIP: Preventing Artifacts in Text-Guided Style Transfer from a Spectral Perspective".☆19Nov 4, 2023Updated 2 years ago
- Implementation for the paper "Reliable Visual Question Answering Abstain Rather Than Answer Incorrectly" (ECCV 2022: https//arxiv.org/abs…☆41May 19, 2023Updated 3 years ago
- [AAAI 2026] Segment Anything Across Shots: A Method and Benchmark☆29Nov 16, 2025Updated 7 months ago
- ☆11Feb 14, 2023Updated 3 years ago
- [CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs☆149Apr 27, 2026Updated 2 months ago
- [ICCV 2025] Prompt-A-Video☆24Feb 2, 2025Updated last year
- [ICLR 2025] IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model☆37Nov 27, 2024Updated last year
- ☆22Mar 7, 2025Updated last year
- End-to-end encrypted email - Proton Mail • AdSpecial offer: 40% Off Yearly / 80% Off First Month. All Proton services are open source and independently audited for security.
- ☆12Sep 6, 2023Updated 2 years ago
- The code of the paper "Negative Pre-aware for Noisy Cross-modal Matching" in AAAI 2024.☆31Jul 2, 2025Updated 11 months ago
- DropKAN (Dropout Kolmogorov Arnold Networks)☆19Jun 23, 2025Updated last year
- The code of MGCC: Text-based Occluded Person Re-identification via Multi-Granularity Contrastive Consistency Learning☆20Feb 26, 2025Updated last year
- ☆21Feb 3, 2025Updated last year
- [NeurIPS 2023] The official implementation of paper "Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval" acce…☆28May 14, 2024Updated 2 years ago
- [CVPR 2021] Pytorch implementation for Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation☆19May 7, 2021Updated 5 years ago
- Visual self-questioning for large vision-language assistant.☆44Jul 23, 2025Updated 11 months ago
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".☆18Sep 17, 2021Updated 4 years ago
- Managed Database hosting by DigitalOcean • AdPostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
- A Survey of Dataset Refinement for Problems in Computer Vision Datasets☆34Sep 12, 2025Updated 9 months ago
- MultiPriv offers multilingual, multimodal PII entities and prompts for studying privacy risks in LLMs/VLMs. It also supports broader PII-…☆32Updated this week
- codes for RFSR: Improving ISR Diffusion Models via Reward Feedback Learning☆18Dec 8, 2024Updated last year
- FedCMR: Federated Cross-Modal Retrieval 的代码(the official implementation of FedCMR: Federated Cross-Modal Retrieval)☆17Oct 17, 2025Updated 8 months ago
- Pytorch Sketch Classification☆11Apr 14, 2018Updated 8 years ago
- Code release for the paper "Egocentric Video Task Translation" (CVPR 2023 Highlight)☆34Jun 12, 2023Updated 3 years ago
- Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning☆12Aug 23, 2025Updated 10 months ago