wkcn / TinyCLIP
[ICCV2023] TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance
☆66 Updated 3 months ago
Related projects
Alternatives and complementary repositories for TinyCLIP
- Zero-label image classification via OpenCLIP knowledge distillation ☆112 Updated last year
- 【ECCV2024】The official repo of Griffon series ☆102 Updated this week
- Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed ☆54 Updated 2 weeks ago
- Training LLaMA language model with MMEngine! It supports LoRA fine-tuning! ☆40 Updated last year
- This repo contains the code and data for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" ☆59 Updated this week
- InstaGen: Enhancing Object Detection by Training on Synthetic Dataset, CVPR2024 ☆73 Updated 7 months ago
- A lightweight flexible Video-MLLM developed by TencentQQ Multimedia Research Team. ☆63 Updated 3 weeks ago
- [CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era" ☆173 Updated 5 months ago
- [EMNLP 2024] RWKV-CLIP: A Robust Vision-Language Representation Learner ☆110 Updated this week
- mllm-npu: training multimodal large language models on Ascend NPUs ☆83 Updated 2 months ago
- A summary of efficient Segment Anything models ☆71 Updated 3 months ago
- 【NeurIPS 2024】Dense Connector for MLLMs ☆133 Updated 3 weeks ago
- Recognize Any Regions ☆118 Updated last month
- PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models" ☆186 Updated 9 months ago
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models ☆51 Updated last week
- Making LLaVA Tiny via MoE-Knowledge Distillation ☆55 Updated 2 weeks ago
- This repo contains the code for our paper "Towards Open-Ended Visual Recognition with Large Language Model" ☆90 Updated 3 months ago
- ✨✨Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models ☆137 Updated this week
- An open-source implementation for fine-tuning the Qwen2-VL series by Alibaba Cloud. ☆96 Updated this week
- DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception ☆115 Updated last month
- Lion: Kindling Vision Intelligence within Large Language Models ☆53 Updated 9 months ago
- [ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model" ☆188 Updated 2 months ago
- [ICCV'23] Cascade-DETR: Delving into High-Quality Universal Object Detection ☆95 Updated last year
- Distilling the powerful segment anything models into lightweight ones for efficient segmentation. ☆29 Updated last year