Long Context Transfer from Language to Vision
☆402 · Mar 18, 2025 · Updated last year
Alternatives and similar repositories for LongVA
Users who are interested in LongVA are comparing it to the repositories listed below.
- ☆157 · Oct 31, 2024 · Updated last year
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture ☆213 · Jan 6, 2025 · Updated last year
- 🔥🔥MLVU: Multi-task Long Video Understanding Benchmark ☆242 · Aug 21, 2025 · Updated 7 months ago
- VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs ☆1,284 · Jan 23, 2025 · Updated last year
- ☆4,607 · Sep 14, 2025 · Updated 6 months ago
- 🔥🔥First-ever hour scale video understanding models ☆616 · Jul 14, 2025 · Updated 8 months ago
- ✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis ☆732 · Dec 8, 2025 · Updated 3 months ago
- Official repository for the paper PLLaVA