codefuse-ai / CodeFuse-MFT-VLM

☆39

Alternatives and similar repositories for CodeFuse-MFT-VLM

Users who are interested in CodeFuse-MFT-VLM are comparing it to the repositories listed below

WePOINTS / WePOINTS
☆186Updated 9 months ago
alipay / Ant-Multi-Modal-Framework
Research Code for Multimodal-Cognition Team in Ant Group
☆169Updated last month
xverse-ai / XVERSE-V-13B
☆79Updated last year
VectorSpaceLab / MegaPairs
[ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval
☆234Updated 2 weeks ago
rednote-hilab / dots.vlm1
The official repository of the dots.vlm1 instruct models proposed by rednote-hilab.
☆264Updated last month
bytedance / Valley
Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.
☆256Updated 2 weeks ago
360CVGroup / SEEChat
Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM
☆101Updated last year
yuyq96 / TextHawk
Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models
☆64Updated last year
360CVGroup / 360VL
Our 2nd-gen LMM
☆34Updated last year
ggg0919 / cantor
☆90Updated last year
flageval-baai / FlagEval
FlagEval is an evaluation toolkit for AI large foundation models.
☆339Updated 6 months ago
infly-ai / INF-MLLM
☆103Updated this week
cnzzx / VSA
Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines
☆126Updated last year
thu-ml / zh-clip
☆72Updated 2 years ago
zai-org / GLM-Edge
GLM Series Edge Models
☆154Updated 5 months ago
zai-org / CogCoM
☆215Updated last year
LinWeizheDragon / FLMR
The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.
☆101Updated 5 months ago
RLHF-V / RLAIF-V
[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
☆423Updated 6 months ago
RhapsodyAILab / MiniCPM-V-Embedding
☆29Updated last year
Ucas-HaoranWei / Vary-family
☆57Updated last year
MonolithFoundation / Bumblebee
A Simple MLLM Surpassed QwenVL-Max with OpenSource Data Only in 14B LLM.
☆38Updated last year
scenarios / WeMM
☆87Updated last year
xmu-xiaoma666 / Multimodal-Open-O1
Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo…
☆29Updated last year
OpenGVLab / MM-Interleaved
MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer
☆247Updated last year
modelscope / easydistill
a toolkit on knowledge distillation for large language models
☆200Updated 2 weeks ago
VPGTrans / VPGTrans
Codes for VPGTrans: Transfer Visual Prompt Generator across LLMs. VL-LLaMA, VL-Vicuna.
☆271Updated 2 years ago
will-singularity / Skywork-MM
Empirical Study Towards Building An Effective Multi-Modal Large Language Model
☆22Updated 2 years ago
AI-Study-Han / Zero-Qwen-VL
Train a LLaVA model with better Chinese support, and open-source the training code and data.
☆76Updated last year
NExT-ChatV / NExT-Chat
The code of the paper "NExT-Chat: An LMM for Chat, Detection and Segmentation".
☆255Updated last year
RLHF-V / RLHF-V
[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
☆297Updated last year