All-In-One VLM: Image + Video + Transfer to Other Languages / Domains (TPAMI 2023)
☆169Aug 22, 2024Updated last year
Alternatives and similar repositories for X2-VLM
Users that are interested in X2-VLM are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)☆502Nov 25, 2022Updated 3 years ago
- Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training (ACL 2023))☆93Jun 12, 2023Updated 2 years ago
- This repo contains codes and instructions for baselines in the VLUE benchmark.☆41Jul 16, 2022Updated 3 years ago
- ☆15Apr 30, 2022Updated 3 years ago
- [ICLR 2023] This is the code repo for our ICLR‘23 paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Spa…☆52Jul 3, 2024Updated last year
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- Code for paper: Unified Text-to-Image Generation and Retrieval☆16Jul 6, 2024Updated last year
- The code of the paper of "A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval" accepted b…☆19Jan 16, 2024Updated 2 years ago
- Code for ALBEF: a new vision-language pre-training method☆1,757Sep 20, 2022Updated 3 years ago
- Grounded Language-Image Pre-training☆2,585Jan 24, 2024Updated 2 years ago
- Progressive Language-guided Visual Learning for Multi-Task Visual Grounding☆13May 9, 2025Updated 10 months ago
- Source code of our AAAI 2024 paper "Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval"☆55Mar 28, 2024Updated 2 years ago
- EVA Series: Visual Representation Fantasies from BAAI☆2,655Aug 1, 2024Updated last year
- Code for "AVG-LLaVA: A Multimodal Large Model with Adaptive Visual Granularity"☆33Oct 12, 2024Updated last year
- [CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"☆808Mar 20, 2024Updated 2 years ago
- NordVPN Special Discount Offer • AdSave on top-rated NordVPN 1 or 2-year plans with secure browsing, privacy protection, and support for for all major platforms.
- Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR …☆292Jun 7, 2023Updated 2 years ago
- Implementation of our paper, Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination..☆20Dec 3, 2023Updated 2 years ago
- Code for EMNLP 2022 paper “Distilled Dual-Encoder Model for Vision-Language Understanding”☆31May 1, 2023Updated 2 years ago
- [NeurIPS 2022] "A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models", Yuanxin Liu, Fandong Meng, Zheng Lin, Jiangnan Li…☆21Jan 9, 2024Updated 2 years ago
- (ECCVW 2025)GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest☆553Jun 3, 2025Updated 9 months ago
- ☆25Mar 4, 2022Updated 4 years ago
- [SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di…☆62Nov 7, 2024Updated last year
- code for TCL: Vision-Language Pre-Training with Triple Contrastive Learning, CVPR 2022☆268Oct 2, 2024Updated last year
- Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning☆24Sep 9, 2024Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training☆30Jun 20, 2023Updated 2 years ago
- ☆35Feb 5, 2024Updated 2 years ago
- Official Code for the ICCV23 Paper: "LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval…☆40Oct 14, 2023Updated 2 years ago
- ☆807Jul 8, 2024Updated last year
- ☆33Sep 27, 2024Updated last year
- [CVPR2023] All in One: Exploring Unified Video-Language Pre-training☆281Mar 25, 2023Updated 3 years ago
- ☆45Aug 14, 2023Updated 2 years ago
- [ECCVW 2024 -- ORAL] Official repository of paper titled "Makeup-Guided Facial Privacy Protection via Untrained Neural Network Priors".☆12Oct 11, 2024Updated last year
- Implementation of our paper, 'Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval.'☆28Dec 3, 2023Updated 2 years ago
- NordVPN Threat Protection Pro™ • AdTake your cybersecurity to the next level. Block phishing, malware, trackers, and ads. Lightweight app that works with all browsers.
- ☆30Jan 18, 2026Updated 2 months ago
- Set of scripts and instructions for sub-selecting and formatting raw data exported by the Canfield ISIC2024 Tile Export Tool. The resulti…☆11May 8, 2024Updated last year
- Code release for "Clue Me In: Semi-Supervised FGVC with Out-of-Distribution Data".☆13Apr 11, 2022Updated 3 years ago
- [NeurIPS 2024] OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling.☆31Nov 13, 2025Updated 4 months ago
- Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)☆178Oct 1, 2024Updated last year
- [CVPR 2024] Official implementation of the paper "Visual In-context Learning"☆531Apr 8, 2024Updated last year
- NeurIPS 2025 Spotlight; ICLR2024 Spotlight; CVPR 2024; EMNLP 2024☆1,826Nov 27, 2025Updated 4 months ago