ShaohonChen/Qwen3-SmVL

ShaohonChen / Qwen3-SmVL

Splices the vision head of SmolVLM2 onto the Qwen3-0.6B model and fine-tunes the combined model.

☆602

Alternatives and similar repositories for Qwen3-SmVL

Users who are interested in Qwen3-SmVL are comparing it to the repositories listed below.

jingyaogong / minimind-v
👀 [Large Models] Train a 65M-parameter multimodal VLM from scratch in just 2 hours!
☆8,367 · Jun 28, 2026 · Updated 3 weeks ago
7Alive7 / VLM-Finetuning
A from-scratch VLM fine-tuning implementation (covering pre-training and SFT) that does not rely on any framework.
☆39 · Aug 22, 2025 · Updated 11 months ago
modelscope / ms-swift
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL…
☆14,952 · Updated this week
wyf3 / llm_related
Reproductions of LLM-related algorithms, plus some study notes.
☆3,465 · Jul 2, 2026 · Updated 3 weeks ago
QwenLM / Qwen3-VL
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
☆19,667 · Jan 30, 2026 · Updated 5 months ago
huggingface / nanoVLM
The simplest, fastest repository for training/finetuning small-sized VLMs.
☆4,966 · Oct 27, 2025 · Updated 8 months ago
hiyouga / EasyR1
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
☆5,082 · Updated this week
ZHEQIUSHUI / CLIP-ONNX-AX650-CPP
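CLIP inference implemented in C++. The model is slightly modified; the modifications and the model-export code can be found in the README, and all model files, including the AX650 models, are in Releases. ChineseCLIP support has been added.
☆31 · Jun 19, 2025 · Updated last year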
Liuziyu77 / Visual-RFT
Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'
☆2,263 · Oct 29, 2025 · Updated 8 months ago
IDEA-Research / Rex-Omni
[CVPR2026] Detect Anything via Next Point Prediction
☆1,516 · Feb 22, 2026 · Updated 5 months ago
EvolvingLMMs-Lab / LLaVA-OneVision-2
Fully Open Framework for Democratized Multimodal Training
☆1,152 · Updated this week
om-ai-lab / VLM-R1
Solve Visual Understanding with Reinforced VLMs
☆6,015 · Jul 7, 2026 · Updated 2 weeks ago
Ostrakon-VL / Ostrakon-VL
☆19 · Mar 19, 2026 · Updated 4 months ago
spacewalk01 / nanosam-cpp
C++ TensorRT Implementation of NanoSAM
☆53 · Dec 28, 2023 · Updated 2 years ago
Coobiw / MPP-LLaVA
Personal Project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Support [video/image/multi-image] {sft/conv…
☆685 · Mar 10, 2025 · Updated last year
verl-project / verl
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
☆22,667 · Updated this week
shenduldh / CosyVoice-Lightning
Lightning-responsive CosyVoice streaming API based on FastAPI.
☆28 · Apr 27, 2026 · Updated 3 months ago
chequanghuy / TriLiteNet
☆47 · Jul 15, 2025 · Updated last year
hiyouga / LlamaFactory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
☆73,523 · Updated this week
Guldfisk5682 / TinyLLaVA-Qwen3
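A low-cost, beginner-friendly project for learning multimodal large models. Built on Qwen3-0.6B and CLIP, it uses the LLaVA architecture with LoRA fine-tuning, and training completes within a few hours on a consumer-grade 16 GB GPU.
☆51 · Sep 15, 2025 · Updated 10 months ago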
828Tina / textvqa_grounding_task_qwen2.5-vl-ft
☆93 · May 20, 2025 · Updated last year
jingyaogong / minimind
🧠 [Large Models] Train a small 64M-parameter LLM entirely from scratch in just 2 hours!
☆53,860 · Updated this week
2U1 / Qwen-VL-Series-Finetune
An open-source implementation for fine-tuning Qwen-VL series by Alibaba Cloud.
☆1,943 · Updated this week
BaofengZan / GOT-OCRv2-onnx
For learning GOT/Qwen/OnnxLLm
☆55 · Oct 8, 2024 · Updated last year
Zeyi-Lin / Qwen3-Medical-SFT
Qwen3 Fine-tuning: Medical R1 Style Chat
☆332 · May 31, 2025 · Updated last year
Meituan-AutoML / MobileVLM
Strong and Open Vision Language Assistant for Mobile Devices
☆1,366 · Apr 15, 2024 · Updated 2 years ago
MoonshotAI / Kimi-VL
Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities
☆1,210 · Jul 15, 2025 · Updated last year
JIA-Lab-research / VisionReasoner
[ICLR 2026] VisionReasoner: Unified Reasoning-Integrated Visual Perception via Reinforcement Learning
☆348 · Feb 9, 2026 · Updated 5 months ago
huggingface / smollm
Everything about the SmolLM and SmolVLM family of models
☆3,852 · May 26, 2026 · Updated 2 months ago
datawhalechina / self-llm
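"A Guide to Using Open-Source Large Models": tutorials tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment.
☆31,429 · Jul 15, 2026 · Updated last week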
alibaba-damo-academy / VL-Cogito
☆24 · Nov 4, 2025 · Updated 8 months ago
l-sf / Linfer
A high-performance C++ inference library based on TensorRT, covering Yolov10, YoloPv2, Yolov5/7/X/8, RT-DETR, and the single-object trackers OSTrack and LightTrack.
☆235 · Jun 12, 2024 · Updated 2 years ago
triple-mu / HunyuanDiT-TensorRT-libtorch
HunyuanDiT with TensorRT and libtorch
☆18 · May 22, 2024 · Updated 2 years ago
isLinXu / paper-read-notes
paper-read-notes
☆13 · Sep 26, 2024 · Updated last year
TeenLucifer / vlm_reproduce
☆40 · Nov 16, 2025 · Updated 8 months ago
hsdslab / MaxMinChunking
Efficient document processing for RAG using MaxMin Chunking.
☆16 · May 20, 2026 · Updated 2 months ago
Intellindust-AI-Lab / DEIMv2
[DEIMv2] Real Time Object Detection Meets DINOv3
☆1,949 · Mar 24, 2026 · Updated 4 months ago
OpenRLHF / OpenRLHF
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy…
☆9,853 · Jul 14, 2026 · Updated last week
TeenLucifer / llm_base
A collection of foundational LLM-related projects covering pre-training, post-training, RAG, agents, and more.
☆40 · Dec 7, 2025 · Updated 7 months ago