zhangfaen / finetune-InternVL2
☆13Updated 3 months ago
Related projects
Alternatives and complementary repositories for finetune-InternVL2
- Building a VLM, starting from the basic modules.☆10Updated 7 months ago
- A Qwen-VL model that can be successfully fine-tuned with LoRA☆16Updated last year
- ☆55Updated 9 months ago
- A Simple MLLM Surpassed QwenVL-Max with OpenSource Data Only in 14B LLM.☆36Updated 2 months ago
- Chinese CLIP models with SOTA performance.☆48Updated last year
- ☆30Updated 6 months ago
- A demo of a PDF parser (including OCR and object detection tools)☆30Updated last month
- Workshop on Foundation Model 1st foundation model challenge Track1 codebase (Open TransMind v1.0)☆18Updated last year
- ☆22Updated last month
- General layout analysis | Chinese document parsing | Document Layout Analysis | layout parser☆45Updated 5 months ago
- BERT model acceleration and deployment with TensorRT☆9Updated 2 years ago
- ☆15Updated 8 months ago
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.☆35Updated last month
- Build a simple basic multimodal large model from scratch. 🤖☆18Updated 5 months ago
- Chinese document classification with layoutlmv3 and layoutxlm☆41Updated 2 years ago
- ☆77Updated 6 months ago
- The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever.☆69Updated 2 months ago
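- An open-source multimodal large language model based on baichuan-7b☆72Updated 11 months ago
- Personal project repository: applications of some large language models and multimodal models☆123Updated 2 weeks ago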
- A light proxy solution for HuggingFace hub.☆44Updated last year
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models☆51Updated 3 weeks ago
- This project uses the LLaVA 1.6 multimodal model to implement text-to-image and image-to-image search.☆17Updated 8 months ago
- Vary-tiny codebase built upon LAVIS (for training from scratch) and a PDF image-text pair dataset (about 600k pairs, including English/Chinese)☆68Updated 2 months ago
- The newest version of llama3, with source code explained line by line in Chinese☆22Updated 7 months ago
- Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo…☆26Updated last month
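- This project covers dataset synthesis, model training, and evaluation for the mathematical problem-solving ability of large models, along with notes on related articles.☆55Updated 2 months ago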
- Research Code for Multimodal-Cognition Team in Ant Group☆123Updated 4 months ago
- Adds some files missing from Visualglm so that Visualglm can be trained; the example performs physiognomy (face reading) recognition on faces☆12Updated last year
- ☆66Updated last year