Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding
☆285Apr 8, 2026Updated this week
Alternatives and similar repositories for Video-MME-v2
Users that are interested in Video-MME-v2 are comparing it to the libraries listed below.
- ✨✨Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy☆306May 14, 2025Updated 10 months ago
- ☆29Jun 17, 2024Updated last year
- ☆10Jul 28, 2022Updated 3 years ago
- ✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis☆746Dec 8, 2025Updated 4 months ago
- A curated list of papers and resources for text-to-image evaluation.☆30Sep 6, 2023Updated 2 years ago
- Unlearnable Examples Give a False Sense of Security: Piercing through Unexploitable Data with Learnable Examples☆11Oct 14, 2024Updated last year
- Official repository of paper "LOVE-R1: Advancing Long Video Understanding with Adaptive Zoom-in Mechanism via Multi-Step Reasoning"☆23Nov 1, 2025Updated 5 months ago
- (ICLR 2026) Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing'☆59Jan 26, 2026Updated 2 months ago
- This repository is the replication package of the NeurIPS19 paper "MarginGAN: Adversarial Training in Semi-Supervised Learning"☆12Oct 27, 2019Updated 6 years ago
- ☆19Jun 29, 2025Updated 9 months ago
- ✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM☆374May 27, 2025Updated 10 months ago
- A video question answering dataset that focuses on the dynamics properties of objects (velocity, acceleration) and their collisions withi…☆19Apr 23, 2025Updated 11 months ago
- [ICCV 2025] SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs☆82Jan 17, 2026Updated 2 months ago
- A simple implementation of ReasonGenRM.☆19Apr 21, 2025Updated 11 months ago
- The official implementation of Freeze-Omni.☆15Jul 10, 2025Updated 8 months ago
- ☆12Sep 11, 2021Updated 4 years ago
- A LaTeX template for presentations in the official IEE design.☆17Apr 1, 2026Updated last week
- Code for CVPR 2022 paper "NomMer: Nominate Synergistic Context in Vision Transformer for Visual Recognition"☆25Jun 14, 2022Updated 3 years ago
- Official code of *Towards Event-oriented Long Video Understanding*☆12Jul 26, 2024Updated last year
- This is a PyTorch implementation of contrastive learning (CL) baselines.☆14Aug 29, 2022Updated 3 years ago
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities☆40Mar 25, 2026Updated 2 weeks ago
- GitHub repo for One-shot Neural Backdoor Erasing via Adversarial Weight Masking (NeurIPS 2022)☆15Jan 3, 2023Updated 3 years ago
- ☆22Feb 13, 2026Updated last month
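- Fine-tunes Alibaba's open-source text detection model, using OCR results returned by the Hehe (合合) recognition service as initial training data, so that the model better fits the specific scenario of 10,000 images and achieves higher text recognition accuracy.☆10Dec 9, 2024Updated last year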
- ☆18Oct 7, 2022Updated 3 years ago
- A library management system built with Qt☆10May 8, 2019Updated 6 years ago
- [NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models☆54Feb 22, 2026Updated last month
- Official implementation of Robust Superpixel-Guided Attentional Adversarial Attack (CVPR 2020)☆10Jan 5, 2022Updated 4 years ago
- Vision Large Language Models trained on M3IT instruction tuning dataset☆17Aug 16, 2023Updated 2 years ago
- 🔥🔥[NeurIPS 2025] Exploring and mitigating semantic hallucinations in scene text perception and reasoning☆27Dec 11, 2025Updated 3 months ago
- Code of paper [CVPR'24: Can Protective Perturbation Safeguard Personal Data from Being Exploited by Stable Diffusion?]☆25Apr 2, 2024Updated 2 years ago
- ☆42Feb 12, 2026Updated last month
- Official code for "VideoReward Thinker: Boosting Video Reward Models through Thinking-with-Image Reasoning"☆45Oct 20, 2025Updated 5 months ago
- Official implementation of Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input☆67Aug 30, 2024Updated last year
- Resource and tool recommendations for research beginners☆17Jul 6, 2020Updated 5 years ago
- Southeast University operating systems course labs for the 2021 computer science cohort. Operating System Labwork source code in Dr. Kai Dong's Operating System Class. Based on OSTEP.☆13Jun 17, 2023Updated 2 years ago
- Website for MathVista☆21Jun 9, 2025Updated 9 months ago
- [CVPR 2024] official code for SimAC☆21Jan 23, 2025Updated last year
- Official eval code for ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation☆27Dec 12, 2025Updated 3 months ago