QwenLM / Qwen3-VL-Embedding
☆1,010 · Feb 2, 2026 · Updated 2 weeks ago
Alternatives and similar repositories for Qwen3-VL-Embedding
Users who are interested in Qwen3-VL-Embedding are comparing it to the libraries listed below.
- ☆17 · Aug 5, 2025 · Updated 6 months ago
- SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time ☆97 · Jan 1, 2026 · Updated last month
- [ACM MM 2025] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs" ☆103 · Dec 8, 2025 · Updated 2 months ago
- This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025] ☆569 · Updated this week
- Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆18,273 · Jan 30, 2026 · Updated 2 weeks ago
- A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings. ☆1,430 · Sep 22, 2025 · Updated 4 months ago
- [ICCV 2023] Black Box Few-Shot Adaptation for Vision-Language models ☆26 · May 14, 2024 · Updated last year
- ☆1,778 · Sep 30, 2025 · Updated 4 months ago
- Official Pytorch implementation of Super Vision Transformer (IJCV) ☆43 · Aug 3, 2023 · Updated 2 years ago
- [ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions ☆138 · May 8, 2025 · Updated 9 months ago
- This is the official Pytorch code for our paper "Artemis: Structured Visual Reasoning for Perception Policy Learning". ☆14 · Dec 4, 2025 · Updated 2 months ago
- gradio bbox labeling tools ☆11 · May 12, 2023 · Updated 2 years ago
- ☆12 · May 23, 2024 · Updated last year
- [CVPR2025] Code Release of F-LMM: Grounding Frozen Large Multimodal Models ☆108 · May 29, 2025 · Updated 8 months ago
- Long Context Research ☆26 · Jan 26, 2026 · Updated 2 weeks ago
- Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba ☆34 · Oct 16, 2025 · Updated 3 months ago
- [AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning" ☆62 · Dec 8, 2025 · Updated 2 months ago
- a ComfyUI plugin that provides a user interface of AudioMass, full-featured web-based audio & waveform editing tool ☆27 · Feb 6, 2026 · Updated last week
- [SIGIR 2025] This is the code repo for our SIGIR'25 paper: Enhancing the Patent Matching Capability of Large Language Models via Memory G… ☆18 · Apr 22, 2025 · Updated 9 months ago
- ☆805 · Jul 8, 2024 · Updated last year
- [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance. ☆9,806 · Sep 22, 2025 · Updated 4 months ago
- LLaVA-UHD v3: Progressive Visual Compression for Efficient Native-Resolution Encoding in MLLMs ☆413 · Dec 20, 2025 · Updated last month
- Solve Visual Understanding with Reinforced VLMs ☆5,841 · Oct 21, 2025 · Updated 3 months ago
- GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning ☆2,182 · Jan 27, 2026 · Updated 2 weeks ago
- Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning ☆45 · Jul 2, 2025 · Updated 7 months ago
- ☆14 · Aug 14, 2023 · Updated 2 years ago
- ☆28 · Jan 11, 2026 · Updated last month
- CoV: Chain-of-View Prompting for Spatial Reasoning ☆50 · Jan 23, 2026 · Updated 3 weeks ago
- InfNeRF: Towards Infinite Scale NeRF Rendering with O(log n) Space Complexity ☆12 · Jan 3, 2026 · Updated last month
- [ICLR'25] Official repository of paper: Ranking-aware adapter for text-driven image ordering with CLIP ☆16 · Apr 17, 2025 · Updated 9 months ago
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ☆625 · Oct 29, 2025 · Updated 3 months ago
- Official implementation of Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model ☆229 · Dec 8, 2025 · Updated 2 months ago
- [NeurIPS 2024] Classification Done Right for Vision-Language Pre-Training ☆227 · Mar 20, 2025 · Updated 10 months ago
- Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks ☆3,816 · Updated this week
- The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud. ☆6,526 · Aug 7, 2024 · Updated last year
- Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving stat… ☆1,544 · Jun 14, 2025 · Updated 8 months ago
- Official Pytorch Implementation of Self-emerging Token Labeling ☆35 · Mar 27, 2024 · Updated last year
- ☆65 · Jan 6, 2026 · Updated last month
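- A Tencent Cloud COS MCP Server based on the MCP protocol that lets large models quickly access Tencent Cloud Object Storage (COS) and Cloud Infinite (CI) capabilities without writing code. ☆24 · Nov 14, 2025 · Updated 3 months ago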