SliMM-X / CoMP-MM
Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models"
☆17 · Updated this week
Alternatives and similar repositories for CoMP-MM:
Users who are interested in CoMP-MM are comparing it to the repositories listed below.
- Official project page of "HiMix: Reducing Computational Complexity in Large Vision-Language Models" · ☆10 · Updated 2 months ago
- This is the official repo for ByteVideoLLM/Dynamic-VLM · ☆20 · Updated 3 months ago
- [CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models · ☆34 · Updated last month
- The official repository for paper "PruneVid: Visual Token Pruning for Efficient Video Large Language Models". · ☆35 · Updated last month
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect… · ☆35 · Updated 9 months ago
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding · ☆35 · Updated last month
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives" · ☆37 · Updated 3 months ago
- ☆56 · Updated last week
- [CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding · ☆42 · Updated last week
- [EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models" · ☆16 · Updated 5 months ago
- Official Repository of Personalized Visual Instruct Tuning