The code for "VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by VIdeo SpatioTemporal Augmentation" [CVPR2025]
☆21 · Feb 27, 2025 · Updated last year
Alternatives and similar repositories for VISTA
Users that are interested in VISTA are comparing it to the libraries listed below.
- More reliable Video Understanding Evaluation ☆15 · Sep 23, 2025 · Updated 7 months ago
- The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25] ☆39 · Feb 1, 2026 · Updated 2 months ago
- Code for the paper "Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers" [ICCV 2025] ☆101 · Jul 28, 2025 · Updated 9 months ago
- ☆37 · Sep 16, 2024 · Updated last year
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs ☆56 · Mar 9, 2025 · Updated last year
- LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding ☆37 · Updated this week
- DreamDance: Personalized Text-to-video Generation by Combining Text-to-Image Synthesis and Motion Transfer ☆14 · Dec 16, 2022 · Updated 3 years ago
- VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning ☆36 · Jul 15, 2025 · Updated 9 months ago
- ABC: Achieving Better Control of Multimodal Embeddings using VLMs [TMLR2025]