This project collects and collates datasets for training multimodal large models, including but not limited to pre-training data, instruction fine-tuning data, and in-context learning data.
☆68 · May 7, 2025 · Updated 10 months ago
Alternatives and similar repositories for Awesome-MLLM-Datasets
Users who are interested in Awesome-MLLM-Datasets are comparing it to the repositories listed below.
- Fast LLM Training CodeBase With dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler] ☆40 · Jan 4, 2024 · Updated 2 years ago
- 🔨🔨🔨 Tool for making model training data set ☆20 · Nov 1, 2024 · Updated last year
- ☆11 · May 17, 2024 · Updated last year
- A multimodal large model for X-ray recognition, fine-tuned from LLaVA1.6 ☆10 · Oct 22, 2024 · Updated last year
- Multi-Task instruction-tuned LLaMA ☆14 · May 5, 2023 · Updated 2 years ago
- Code for paper: Unified Text-to-Image Generation and Retrieval ☆16 · Jul 6, 2024 · Updated last year
- ☆25 · Feb 2, 2025 · Updated last year
- Web application for real-time object detection 🔎 using Flask 🌶, OpenCV, and YoloV3 weights. It uses the COCO Dataset 🖼. ☆16 · Apr 19, 2021 · Updated 4 years ago
- This repo offers advanced tutorials for LLMs, BERT-based models, and multimodal models, covering fine-tuning, quantization, vocabulary ex… ☆24 · May 5, 2025 · Updated 10 months ago
- Awesome paper for multi-modal llm with grounding ability ☆19 · Oct 11, 2025 · Updated 4 months ago
- Code implementation of Baichuan Dynamic NTK-ALiBi: inference on longer texts without fine-tuning ☆49 · Aug 27, 2023 · Updated 2 years ago
- Using convolutional neural networks for the 2019 Kidney and Kidney Tumor Segmentation Challenge ☆19 · Dec 13, 2019 · Updated 6 years ago
- ☆20 · Jan 6, 2023 · Updated 3 years ago
- Modality Gap–Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models ☆51 · Feb 23, 2026 · Updated 2 weeks ago
- DEYOv1.5 ☆29 · Jul 22, 2024 · Updated last year
- ☆32 · Nov 15, 2022 · Updated 3 years ago