taishan1994 / MiniClip
Train a simple CLIP model hands-on to deepen your understanding of CLIP.
☆21Updated 6 months ago
Alternatives and similar repositories for MiniClip
Users that are interested in MiniClip are comparing it to the libraries listed below
- Multimodal MM + Chat collection☆279Updated 3 months ago
- A toolbox of yolo models and algorithms based on MindSpore☆168Updated last week
- Building a VLM model, starting from the basic modules.☆18Updated last year
- Splices the SmolVLM2 vision head onto the Qwen3-0.6B model and fine-tunes the combination☆465Updated 3 months ago
- ☆30Updated last year
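- An ecosystem of large language models and multimodal models, mainly covering cross-modal search, speculative decoding, QAT quantization, multimodal quantization, ChatBot, and OCR☆194Updated 4 months ago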
- README.md☆48Updated 2 years ago
- A toolbox of vision models and algorithms based on MindSpore☆264Updated 4 months ago
- This project showcases the deployment of the RT-DETR model using ONNXRUNTIME in C++ and Python.☆58Updated 2 years ago
- PyTorch distributed training framework☆84Updated last week
- ☆70Updated 2 years ago
- Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed☆107Updated last year
- [ArXiv] PDF-Wukong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling☆128Updated 6 months ago
- Official implementation of RT-DETRv4: Painlessly Furthering Real-Time Object Detection with Vision Foundation Models☆162Updated 2 weeks ago
- Trains a LLaVA model with better Chinese support and open-sources the training code and data.☆77Updated last year
- ☆23Updated last year
- ☆78Updated 6 months ago
- YOLOv10 implemented with mmyolo☆43Updated last year
- A clip-pytorch model that can be trained on your own dataset.☆247Updated 2 years ago
- Fine-tuning Grounding DINO☆150Updated 4 months ago
- Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection☆93Updated 9 months ago
- Research Code for Multimodal-Cognition Team in Ant Group☆169Updated 2 months ago
- Trains on a custom dataset using the Open-GroundingDino code and documents the reproduction process☆31Updated 10 months ago
- ☆20Updated last year
- Build a simple basic multimodal large model from scratch. 🤖☆48Updated last year
- ☆186Updated last year
- Convert data from YOLO format to COCO format☆13Updated 2 years ago
- ☆33Updated 9 months ago
- The source code of IEEE TPAMI 2025 "Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation".☆116Updated last year
- A from-scratch (0 to 1) VLM fine-tune implemented without relying on any framework (including pre-training and SFT)☆35Updated 3 months ago