mbzuai-oryx / VideoGPT-plus
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
☆292 · Aug 5, 2025 · Updated 6 months ago
Alternatives and similar repositories for VideoGPT-plus
Users who are interested in VideoGPT-plus are comparing it to the repositories listed below.
- [ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the cap… · ☆1,488 · Aug 5, 2025 · Updated 6 months ago
- VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs · ☆1,277 · Jan 23, 2025 · Updated last year
- PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models · ☆261 · Aug 5, 2025 · Updated 6 months ago
- Official repository for the paper PLLaVA · ☆676 · Jul 28, 2024 · Updated last year
- ☆32 · Jul 29, 2024 · Updated last year
- Long Context Transfer from Language to Vision · ☆398 · Mar 18, 2025 · Updated 10 months ago
- Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model · ☆133 · Aug 6, 2025 · Updated 6 months ago
- [CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding · ☆409 · May 8, 2025 · Updated 9 months ago
- ☆107 · Jul 30, 2024 · Updated last year
- E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024) · ☆74 · Jan 20, 2025 · Updated last year
- Official InfiniBench: A Benchmark for Large Multi-Modal Models in Long-Form Movies and TV Shows · ☆19 · Nov 4, 2025 · Updated 3 months ago
- (2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding