Paddle Multimodal Integration and eXploration (PaddleMIX) supports mainstream multimodal tasks and includes end-to-end large-scale multimodal pretrained models and a diffusion model toolbox, with an emphasis on high performance and flexibility.
☆715Feb 3, 2026Updated last month
Alternatives and similar repositories for PaddleMIX
Users who are interested in PaddleMIX are comparing it to the libraries listed below:
- PaddlePaddle Code Convert Toolkit, a deep learning code conversion tool for PaddlePaddle.☆124Updated this week
- Paddle Automatically Diff Precision Toolkits.☆53Dec 5, 2025Updated 2 months ago
- High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle☆3,653Updated this week
- 🚀🚀🚀 YOLO series of PaddlePaddle implementation, PP-YOLOE+, RT-DETR, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv10, YOLO11, YOLOX, YOLOv5u, Y…☆659Jan 14, 2026Updated last month
- 360LayoutAnalysis, a series of document analysis models and datasets developed by 360 AI Research Institute☆306Sep 10, 2024Updated last year
- Easy-to-use and powerful LLM and SLM library with awesome model zoo.☆12,916Dec 17, 2025Updated 2 months ago
- ERNIE Bot Agent is a Large Language Model (LLM) Agent Framework, powered by the advanced capabilities of ERNIE Bot and the platform resou…☆377Aug 20, 2024Updated last year
- Multimodal chatbot with computer vision capabilities integrated, our 1st-gen LMM☆101May 17, 2024Updated last year
- All-in-One Development Tool based on PaddlePaddle☆6,049Updated this week
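- PaddlePaddle large-model development suite, providing an end-to-end development toolchain for large language models, cross-modal large models, biocomputing large models, and other domains.☆477May 24, 2024Updated last year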
- A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team…☆1,820Apr 9, 2025Updated 10 months ago
- ☆269Nov 20, 2025Updated 3 months ago
- PaddlePaddle intelligent annotation, keeping your labeling one step ahead☆294Nov 25, 2024Updated last year
- Collects the best currently open-source table recognition models, refines their pre-processing and post-processing, and converts the models to ONNX.☆924Aug 3, 2025Updated 7 months ago
- ONNX Model Exporter for PaddlePaddle☆901Jan 13, 2026Updated last month
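- Inference library for sequence-based table recognition, integrating table recognition algorithms such as PP-Structure and modelscope.☆410Sep 4, 2025Updated 5 months ago
- Repository of pretrained vision foundation models☆501Apr 12, 2023Updated 2 years ago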
- PaddleSlim is an open-source library for deep model compression and architecture search.☆1,612Jan 4, 2026Updated last month
- A 3D computer vision development toolkit based on PaddlePaddle. It supports point-cloud object detection, segmentation, and monocular 3D …☆634Apr 22, 2025Updated 10 months ago
- Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)☆1,947Jan 24, 2026Updated last month
- Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-ti…☆14,093Feb 13, 2026Updated 2 weeks ago
- [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance☆9,836Sep 22, 2025Updated 5 months ago
- An experimental project for paddle python IR.☆15Dec 4, 2023Updated 2 years ago
- Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB and skeleton based a…☆1,679Feb 12, 2025Updated last year
- PASSL includes image self-supervised algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, simsiam, SwAV, BEiT, and MAE, as well as Vision Transformer, DEiT, Swin Transformer, CvT, T2T-ViT, MLP-…☆287Aug 1, 2023Updated 2 years ago
- Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL,…☆12,820Updated this week
- LMDeploy is a toolkit for compressing, deploying, and serving LLMs.☆7,618Feb 24, 2026Updated last week
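- Deep learning model converter for PaddlePaddle.☆769Oct 22, 2025Updated 4 months ago
- Introductory, advanced, and specialized deep learning courses, academic cases, industry practice cases, a deep learning knowledge encyclopedia, and an interview question bank. The courses, cases, and knowledge of Deep Learning and AI☆3,578Jul 25, 2024Updated last year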
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆8,089Feb 10, 2025Updated last year
- ☆15Jan 7, 2022Updated 4 years ago
- Qianfan-VL: Domain-Enhanced Universal Vision-Language Models☆181Sep 22, 2025Updated 5 months ago
- PaddlePaddle Developer Community☆135Updated this week
- paddle code convert toolkit☆22Mar 19, 2023Updated 2 years ago
- Document orientation classification☆222Feb 3, 2026Updated last month
- GPT4V-level open-source multi-modal model based on Llama3-8B☆2,430Mar 3, 2025Updated last year
- DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception☆2,017Apr 14, 2025Updated 10 months ago
- On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)☆796Jul 5, 2025Updated 7 months ago
- A plugin-oriented framework for video structuring. Developers in China can add WeChat zhzhi78 to join the discussion group.☆18May 28, 2024Updated last year