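A powerful multimodal large language model (MLLM) that supports multimodal input such as text, images, and video, with strong understanding, reasoning, and generation capabilities.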
☆23Mar 19, 2025Updated 11 months ago
Alternatives and similar repositories for MUG-U
Users that are interested in MUG-U are comparing it to the libraries listed below
- A deep learning framework implemented in pure Python, to help you understand the low-level details and land a job offer☆21Aug 26, 2022Updated 3 years ago
- Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types☆32Jul 16, 2025Updated 7 months ago
- Automatically generates captions for an image using Image processing and NLP. Model was trained on Flickr30K dataset.☆11Jun 11, 2020Updated 5 years ago
- ☆11May 24, 2024Updated last year
- Instance-Level Salient Object Detection, Computer Vision and Image Understanding (CVIU), 2021.☆12Apr 23, 2021Updated 4 years ago
- ☆33Jan 9, 2026Updated last month
- ☆12Feb 27, 2025Updated last year
- This repository is an official implementation of the paper A Simple Baseline for Open-World Tracking via Self-training.☆10Jan 26, 2024Updated 2 years ago
- ☆29Feb 12, 2026Updated 3 weeks ago
- [NAACL 2025] Guiding Large Language Models in Code Execution with Fine-grained Multimodal Chain-of-Thought Reasoning☆12Feb 9, 2025Updated last year
- BathML's parking prediction project (2016-18)☆10Mar 11, 2018Updated 7 years ago
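- A small reading list that collects PDFs of original English-language books on computer science and technology.☆10Jan 13, 2022Updated 4 years ago
- Fengru Cup project: an embedded real-time object detection system☆10Mar 20, 2019Updated 6 years ago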
- ☆30Updated this week
- ☆14Apr 25, 2023Updated 2 years ago
- Tool to parse wiki tables from the HTML dump of Wikipedia☆11Jun 12, 2022Updated 3 years ago
- ☆14Oct 21, 2024Updated last year
- [MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501☆61Jul 26, 2024Updated last year
- Paper_Reading☆16Oct 25, 2021Updated 4 years ago
- Extended message for darknet_ros_msgs☆15Jul 7, 2022Updated 3 years ago
- Code accompanying our NeurIPS 2020 traffic4cast challenge☆14Oct 4, 2021Updated 4 years ago
- An implementation of DSOD in PyTorch☆15Jul 13, 2018Updated 7 years ago
- A content-based image retrieval system☆14Aug 19, 2022Updated 3 years ago
- ☆24Jun 18, 2025Updated 8 months ago
- My own C++ additional examples using DepthAI☆15May 23, 2021Updated 4 years ago
- we propose a novel and efficient cross-domain human parsing model to bridge the cross-domain differences in terms of visual appearance an…☆15Jan 9, 2018Updated 8 years ago
- The official codes for Fast Monte Carlo Rendering via Multi-Resolution Sampling☆16Dec 2, 2021Updated 4 years ago
- Real-time object detection using YOLO for online shopping☆11Mar 17, 2019Updated 6 years ago
- A Python tool that generates LaTeX code (e.g. tables, matrices).☆10Jun 22, 2022Updated 3 years ago
- ☆12Jan 19, 2025Updated last year
- This project is an attempt at performing color quantization using K-Means clustering. We also add our own touch by trying a different ini…☆15Jul 31, 2020Updated 5 years ago
- Video Panoptic Segmentation☆16Jun 19, 2020Updated 5 years ago
- Efficient Pose Machine for Multi-Person Pose Estimation☆15Dec 20, 2019Updated 6 years ago
- an end-to-end instance-segmentation framework inspired by YOLO and mask R-CNN☆13Nov 22, 2019Updated 6 years ago
- Laplacian-Pyramid-Reconstruction-and-Refinement-for-Semantic-Segmentation in Pytorch☆12Nov 3, 2018Updated 7 years ago
- ☆14Dec 27, 2016Updated 9 years ago
- MAT: Mask-Aware Transformer for Large Hole Image Inpainting☆17Apr 1, 2022Updated 3 years ago
- LV-BERT: Exploiting Layer Variety for BERT (Findings of ACL 2021)☆18May 10, 2023Updated 2 years ago
- NVIDIA GPU Accelerated Application Samples in Google Cloud Platform☆21Feb 21, 2026Updated last week