[NeurIPS 2025] 𝓡𝓣𝓥-𝓑𝓮𝓷𝓬𝓱: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video.
☆33Jan 15, 2026Updated 4 months ago
Alternatives and similar repositories for RTV-Bench
Users that are interested in RTV-Bench are comparing it to the libraries listed below.
- 🔥An open-source survey of the latest video reasoning tasks, paradigms, and benchmarks.☆182May 5, 2026Updated 3 weeks ago
- [CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?☆144Jul 24, 2025Updated 10 months ago
- [ICLR'2025 Spotlight] Official repository for "SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding"☆88Nov 23, 2025Updated 6 months ago
- ☆32Jul 29, 2024Updated last year
- [NeurIPS'25 Spotlight] Official implementation of "JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation"☆73Feb 26, 2026Updated 3 months ago
- A real-time video understanding foundation model built on Llama-3.2-Vision, featuring comprehensively extended video processing and multi…☆139Apr 13, 2026Updated last month
- [NeurIPS'24 D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.☆125Jul 27, 2024Updated last year
- ☆13May 17, 2025Updated last year
- [AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P…☆66Jan 27, 2026Updated 4 months ago
- [TPAMI2025] BackMix: Regularizing Open Set Recognition by Removing Underlying Fore-Background Priors☆16Apr 23, 2025Updated last year
- ☆26Apr 26, 2025Updated last year
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs☆56Mar 9, 2025Updated last year
- ☆12Apr 12, 2026Updated last month
- Repository of GUI Action Narrator☆13Apr 8, 2025Updated last year
- Open-source strong baseline for domain generalization re-ID. We will update the strong baseline and CFD method~☆10Nov 30, 2021Updated 4 years ago
- Tempo: Small Vision-Language Models are Smart Compressors for Long Video Understanding☆68Apr 29, 2026Updated last month
- ☆25Jul 20, 2025Updated 10 months ago
- Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM | EMNLP 2025 Findings☆18Oct 17, 2025Updated 7 months ago
- A simple video streaming baseline that outperforms SOTAs.☆132May 1, 2026Updated 3 weeks ago
- [CVPR'25] AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data☆17Mar 27, 2025Updated last year
- CLiC: Concept Learning in Context☆10Jan 24, 2025Updated last year
- Official implementation of "In-style: Bridging Text and Uncurated Videos with Style Transfer for Cross-modal Retrieval." ICCV 2023☆11Oct 5, 2023Updated 2 years ago
- ☆14Dec 12, 2023Updated 2 years ago
- ICCV23 "Householder Projector for Unsupervised Latent Semantics Discovery"☆17Jun 26, 2025Updated 11 months ago
- [Paper] SoMoFormer: Multi-Person Pose Forecasting with Transformers☆27Mar 1, 2023Updated 3 years ago
- CS194-196 Course Project☆14Feb 20, 2025Updated last year
- [ICCV 2025 Oral] Official implementation of Learning Streaming Video Representation via Multitask Training.☆91Dec 24, 2025Updated 5 months ago
- Exercise solver for ML on Coursera☆11Jan 31, 2023Updated 3 years ago
- Official PyTorch implementation Source code for Weakly Supervised Video Scene Graph Generation via Natural Language Supervision, accepted…☆24Jun 13, 2025Updated 11 months ago
- VideoMathQA is a benchmark designed to evaluate mathematical reasoning in real-world educational videos☆23May 7, 2026Updated 3 weeks ago
- [ICLR'25] Streaming Video Question-Answering with In-context Video KV-Cache Retrieval☆120Nov 4, 2025Updated 6 months ago
- Paper Reading of IMCC groups.☆17Oct 22, 2025Updated 7 months ago
- Building a quick conversation-based search demo with langchain.☆10Apr 2, 2024Updated 2 years ago
- [CVPR 2025] GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration☆20Mar 21, 2025Updated last year
- Benchmark for agentic spatial data analysis☆29Apr 29, 2026Updated last month
- Code Implementation for AutoAttend: Automated Attention Representation Search☆11Jul 26, 2021Updated 4 years ago
- [TMLR'26] UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models☆54May 17, 2026Updated last week
- The official GitHub page for the survey paper "A Survey on LLM Symbolic Reasoning", which is currently under review.☆34Updated this week
- [NeurIPS 2024 Oral] RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation☆19Dec 22, 2024Updated last year