codefuse-ai / CodeFuse-MFT-VLM
☆39Updated 11 months ago
Alternatives and similar repositories for CodeFuse-MFT-VLM
Users that are interested in CodeFuse-MFT-VLM are comparing it to the libraries listed below
- ☆79Updated last year
- GLM Series Edge Models☆149Updated 3 months ago
- ☆177Updated 7 months ago
- The official repository of the dots.vlm1 instruct models proposed by rednote-hilab.☆246Updated 2 weeks ago
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.☆251Updated last month
- Our 2nd-gen LMM☆34Updated last year
- [ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval☆223Updated 3 months ago
- Mixture-of-Experts (MoE) Language Model☆190Updated last year
- Train a LLaVA model with better Chinese support, and open-source the training code and data.☆70Updated last year
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models☆62Updated 10 months ago
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines☆126Updated 10 months ago
- SUS-Chat: Instruction tuning done right☆49Updated last year
- A Simple MLLM Surpassed QwenVL-Max with OpenSource Data Only in 14B LLM.☆38Updated last year
- Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM☆101Updated last year
- ☆92Updated 2 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models☆137Updated last year
- LongQLoRA: Extend Context Length of LLMs Efficiently☆166Updated last year
- ☆249Updated last year
- official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"☆154Updated last year
- ☆57Updated last year
- [ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"☆229Updated 5 months ago
- X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages☆314Updated 2 years ago
- ControlLLM: Augment Language Models with Tools by Searching on Graphs☆193Updated last year
- Image Textualization: An Automatic Framework for Generating Rich and Detailed Image Descriptions (NeurIPS 2024)☆166Updated last year
- Research Code for Multimodal-Cognition Team in Ant Group☆165Updated 2 months ago
- Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊☆269Updated 7 months ago
- ☆231Updated last year
- FlagEval is an evaluation toolkit for AI large foundation models.☆337Updated 4 months ago
- ☆211Updated last year
- Empirical Study Towards Building An Effective Multi-Modal Large Language Model☆22Updated last year