princetonvisualai / merv
Unifying Specialized Visual Encoders for Video Language Models
☆23 · Updated 3 weeks ago
Alternatives and similar repositories for merv
Users who are interested in merv are comparing it to the repositories listed below.
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆94 · Updated 9 months ago
- The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025] ☆20 · Updated 9 months ago
- Test-Time Training on Video Streams