PaddlePaddle / PaddleMIX
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.
☆675Updated this week
Alternatives and similar repositories for PaddleMIX
Users who are interested in PaddleMIX are comparing it to the libraries listed below.
- ERNIE Bot Agent is a Large Language Model (LLM) Agent Framework, powered by the advanced capabilities of ERNIE Bot and the platform resou…☆369Updated 11 months ago
- This is a user guide for the MiniCPM and MiniCPM-V series of small language models (SLMs) developed by ModelBest. “面壁小钢炮” focuses on achi…☆263Updated last month
- Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)☆622Updated 7 months ago
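- Splices the vision head of SmolVLM2 onto the Qwen3-0.6B model and fine-tunes the combined model☆128Updated this week
- An ecosystem of tools for large language models and multimodal models, mainly covering cross-modal search, speculative decoding, QAT quantization, multimodal quantization, ChatBot, and OCR☆185Updated last week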
- huggingface mirror download☆584Updated 4 months ago
- ☆104Updated last year
- [ACL 2025 Oral] 🔥🔥 MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval☆210Updated 2 months ago
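- An open-source, commercially usable multimodal model supporting bilingual Chinese-English vision-text dialogue.☆373Updated last year
- PaddlePaddle Code Convert Toolkit, a deep learning code conversion tool for PaddlePaddle (飞桨)☆107Updated this week
- Multimodal Chinese LLaMA & Alpaca large language model (VisualCLA)☆450Updated 2 years ago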
- A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.☆1,003Updated this week
- Tongyi Qianwen (Qwen) vLLM inference deployment demo☆592Updated last year
- ☆64Updated last year
- ☆325Updated last week
- GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.☆960Updated 2 weeks ago
- vLLM-accelerated implementation of GOT, combined with MinerU for PDF parsing in RAG☆61Updated 8 months ago
- ☆235Updated 5 months ago
- ☆102Updated 2 years ago
- Yuan 2.0 Large Language Model☆689Updated last year
- A Multi-modal RAG Project with Dataset from Honor of Kings, one of the most popular smart phone games in China☆66Updated 11 months ago
- ☆170Updated this week
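- An inference library for sequence-based table recognition algorithms, integrating table recognition algorithms such as PP-Structure and modelscope.☆340Updated 2 weeks ago
- Repository of vision pre-trained foundation models☆499Updated 2 years ago
- [ICLR'24 spotlight] Chinese and English Multimodal Large Model Series (Chat and Paint), based on the CPM foundation models☆1,063Updated last year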
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.☆245Updated 5 months ago
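- A role-playing multi-LLM chat room fine-tuned from InternLM2 on data built from the original text of Journey to the West, its vernacular Chinese version, and ChatGPT-generated data. This project covers everything about role-playing LLMs, from data acquisition and data processing, to fine-tuning with XTuner and deploying to OpenXLab, to deploying with LMDeploy, with op…☆102Updated last year
- Llama3-Tutorial (XTuner, LMDeploy, OpenCompass)☆511Updated last year
- A collection of Chinese text-to-image Stable Diffusion models☆347Updated last month
- Analysis of Chinese and English layouts☆234Updated 3 weeks ago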