A simple pytorch implementation of baseline based-on CLIP for Image-text Matching.
☆19May 25, 2023Updated 2 years ago
Alternatives and similar repositories for CLIP_ITM
Users that are interested in CLIP_ITM are comparing it to the libraries listed below
Sorting:
- The code of "Image-text Retrieval via Preserving Main Semantic of Vision" in ICME 2023.☆15Dec 25, 2023Updated 2 years ago
- [NeurIPS 2023] The official implementation of paper "Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval" acce…☆27May 14, 2024Updated last year
- The code of the paper "Negative Pre-aware for Noisy Cross-modal Matching" in AAAI 2024.☆30Jul 2, 2025Updated 8 months ago
- [ACL 2025] The official implementation of the paper "PIGuard: Prompt Injection Guardrail via Mitigating Overdefense for Free".☆60Dec 4, 2025Updated 2 months ago
- [ICCV 2023] With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning.☆19Jun 7, 2024Updated last year
- The code of the paper of "A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval" accepted b…☆19Jan 16, 2024Updated 2 years ago
- Can 3D Vision-Language Models Truly Understand Natural Language?☆20Mar 28, 2024Updated last year
- Awesome multi-modal large language paper/project, collections of popular training strategies, e.g., PEFT, LoRA.☆26Aug 2, 2024Updated last year
- [ICRA 2024] WLST: Weak Labels Guided Self-training for Weakly-supervised Domain Adaptation on 3D Object Detection☆12Feb 6, 2024Updated 2 years ago
- [ICLR 2025] Official PyTorch Implementation for CPE: Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Ga…☆12Apr 7, 2025Updated 10 months ago
- [ICLR 2025] Official code repository for "TULIP: Token-length Upgraded CLIP"☆33Jan 26, 2026Updated last month
- Official repository of StablePrompt: Automatic Prompt Tuning using Reinforcement Learning for Large Language Model (EMNLP 2024)☆42Feb 11, 2025Updated last year
- Official Implementation of Attentive Mask CLIP (ICCV2023, https://arxiv.org/abs/2212.08653)☆35May 29, 2024Updated last year
- Official Repo For AAAI 2026 Accepted Paper "Rethinking the Spatio-Temporal Alignment of End-to-End 3D Perception"☆28Jan 13, 2026Updated last month
- Implementation of the CVPR2025 paper LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty.☆17Sep 10, 2025Updated 5 months ago
- AAPL: Adding Attributes to Prompt Learning for Vision-Language Models (CVPRw 2024)☆34May 8, 2024Updated last year
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"☆46Dec 1, 2024Updated last year
- Source code for the paper "Prefix Language Models are Unified Modal Learners"☆44Apr 30, 2023Updated 2 years ago
- Improving Continuous Sign Language Recognition with Adapted Image Models☆14Nov 10, 2025Updated 3 months ago
- Documentation at☆14Mar 27, 2025Updated 11 months ago
- [NeurIPS 2025] The official implementation of the paper "DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agen…☆39Feb 14, 2026Updated 2 weeks ago
- [ICLR 2026] SpatialLadder: Progressive Training for Spatial Reasoning in Vision-Language Models☆74Jan 29, 2026Updated last month
- Code for Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model☆13Feb 15, 2024Updated 2 years ago
- [ACL 2024 Oral] This is the code repo for our ACL‘24 paper "MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Mo…☆39Jun 30, 2024Updated last year
- [CVPR2025] Official code for Lost in Translation Found in Context☆23Jan 14, 2026Updated last month
- AlignCLIP: Improving Cross-Modal Alignment in CLIP (ICLR 2025)☆59Mar 1, 2025Updated last year
- Official Implementation (PyTorch) of "Point Cloud Augmentation with Weighted Local Transformations", ICCV 2021☆41Mar 2, 2022Updated 4 years ago
- The official implementation of Diffusion Distillation With Direct Preference Optimization For Efficient 3D LiDAR Scene Completion [AAAI'2…☆15Feb 2, 2026Updated last month
- [NeurIPS 23] Characterizing OOD Error via Optimal Transport☆13Nov 19, 2023Updated 2 years ago
- [WACV 2025-Oral Presentation] Test-Time Adaptation in Point Clouds: Leveraging Sampling Variation with Weight Averaging☆12Mar 31, 2025Updated 11 months ago
- ☆11May 17, 2024Updated last year
- ☆11Aug 7, 2025Updated 6 months ago
- [ECCV 2024] R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection☆18Jan 6, 2025Updated last year
- Self-supervised adversarial masking for point clouds☆11Jul 12, 2023Updated 2 years ago
- [ACM MM-24] Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization☆12Oct 8, 2024Updated last year
- [NeurIPS'25] Backdoor Cleaning without External Guidance in MLLM Fine-tuning☆17Oct 13, 2025Updated 4 months ago
- Efficient Decoupled Feature 3D Gaussian Splatting via Hierarchical Compression☆12Mar 17, 2025Updated 11 months ago
- Extended Implementation of FastLGS☆16Dec 17, 2024Updated last year
- ☆10Apr 8, 2024Updated last year