codefuse-ai / CodeFuse-MFT-VLM
☆41 Updated last year
Alternatives and similar repositories for CodeFuse-MFT-VLM
Users who are interested in CodeFuse-MFT-VLM are comparing it to the libraries listed below.
- ☆79 Updated last year
- ☆187 Updated last year
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models ☆65 Updated last year
- [ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval ☆241 Updated 3 months ago
- The official repository of the dots.vlm1 instruct models proposed by rednote-hilab. ☆284 Updated 4 months ago
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data. ☆270 Updated 2 weeks ago
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines ☆130 Updated last year
- ☆114 Updated last month
- Research Code for Multimodal-Cognition Team in Ant Group ☆172 Updated 3 months ago
- GLM Series Edge Models ☆158 Updated 7 months ago
- ☆29 Updated last year
- ☆75 Updated last year
- The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever. ☆105 Updated 8 months ago
- Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM ☆101 Updated last year
- Mixture-of-Experts (MoE) Language Model ☆195 Updated last year
- a toolkit on knowledge distillation for large language models ☆266 Updated this week
- ☆218 Updated last year
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs ☆100 Updated last year
- Codes for VPGTrans: Transfer Visual Prompt Generator across LLMs. VL-LLaMA, VL-Vicuna. ☆270 Updated 2 years ago
- MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer ☆248 Updated last year
- [CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness ☆442 Updated 8 months ago
- The code of the paper "NExT-Chat: An LMM for Chat, Detection and Segmentation". ☆252 Updated 2 years ago
- ☆72 Updated 2 years ago
- [COLM 2025] Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources ☆306 Updated 5 months ago
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of… ☆122 Updated last year
- ☆254 Updated 2 years ago
- [CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback ☆306 Updated last year
- Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo… ☆29 Updated last year
- ☆57 Updated 2 years ago
- MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources ☆215 Updated 4 months ago