percent4 / yi_vl_experiment
This project contains experiments with and applications of Yi's multimodal model series, such as Yi-VL-6B/34B.
☆13 · Updated last year
Alternatives and similar repositories for yi_vl_experiment:
Users interested in yi_vl_experiment are comparing it to the libraries listed below:
- A multimodal large-scale model that performs close to the closed-source Qwen-VL-PLUS on many datasets and significantly surpasses the p… ☆14 · Updated last year
- A curated collection of information on data competitions in China and abroad ☆18 · Updated 3 years ago
- Stable Diffusion in TensorRT 8.5+ ☆14 · Updated 2 years ago
- NVIDIA TensorRT Hackathon 2023 final-round topic: building and optimizing the Tongyi Qianwen Qwen-7B model with TensorRT-LLM ☆41 · Updated last year
- TensorFlow implementation for Dash ☆32 · Updated 2 years ago
- A music large model based on InternLM2-chat. ☆22 · Updated 3 months ago
- Chinese CLIP models with SOTA performance. ☆54 · Updated last year
- ☆25 · Updated 3 months ago
- Whisper in TensorRT-LLM ☆15 · Updated last year
- Comparison of LLM API performance metrics: an in-depth analysis of key indicators such as TTFT and TPS ☆16 · Updated 6 months ago
- Our 2nd-gen LMM ☆33 · Updated 10 months ago
- An AIGC application integrating an LLM with SDXL ☆27 · Updated last year
- qwen2 and llama3 cpp implementation ☆43 · Updated 9 months ago
- Repository for the NeurIPS 2024 paper "SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models by Searching Up…" ☆22 · Updated 3 months ago
- Empirical Study Towards Building An Effective Multi-Modal Large Language Model ☆23 · Updated last year
- This project uses the LLaVA 1.6 multimodal model to implement text-to-image search and image-to-image search. ☆20 · Updated last year
- Workshop on Foundation Model: 1st foundation model challenge, Track 1 codebase (Open TransMind v1.0) ☆18 · Updated 2 years ago
- Tianchi NVIDIA TensorRT Hackathon 2023 generative AI model optimization competition: third-place solution in the preliminary round ☆49 · Updated last year
- The llava-Qwen2-7B-Instruct-Chinese-CLIP model enhances Chinese text recognition and meme-connotation recognition, approaching the recognition level of gpt4o and claude-3.5-sonnet! ☆22 · Updated 8 months ago
- Code for the AAAI 2025 paper "VIoTGPT: Learning to Schedule Vision Tools in LLMs towards Intelligent Video Internet of Things" ☆11 · Updated 2 months ago
- Real-time video understanding and interaction through text, audio, image, and video with a large multimodal model. A framework for real-time video understanding and interaction using multimodal large models, via text… ☆23 · Updated last year
- A multimodal large model implemented from scratch and named Reyes (睿视): R for 睿, eyes for 眼. Reyes has 8B parameters, uses InternViT-300M-448px-V2_5 as the vision encoder and Qwen2.5-7B-Instruct on the language-model side, and also uses a two-layer MLP projection layer to connect… ☆10 · Updated last month
- Training code for Taiyi-Diffusion-XL ☆21 · Updated 9 months ago
- LLaVA combined with the Magvit image tokenizer, training an MLLM without a vision encoder; unifying image understanding and generation. ☆35 · Updated 9 months ago
- WanJuan-CC is a high-quality dataset built from CommonCrawl through data extraction, rule-based cleaning, deduplication, safety filtering, and quality cleaning. ☆13 · Updated 11 months ago
- ☆27 · Updated 10 months ago
- Code for "An Empirical Study of Retrieval Augmented Generation with Chain-of-Thought"☆12Updated 8 months ago
- A dead-simple and modularized multi-modal training and finetuning framework, compatible with any LLaVA/Flamingo/QwenVL/MiniGemini etc. series … ☆18 · Updated 11 months ago
- Xtuner Factory ☆33 · Updated last year
- Building a VLM model starting from the basic modules. ☆14 · Updated 11 months ago