LJungang/RTV-Bench

[NeurIPS 2025] 𝓡𝓣𝓥-𝓑𝓮𝓷𝓬𝓱: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video.

☆33

Alternatives and similar repositories for RTV-Bench

Users who are interested in RTV-Bench are comparing it to the repositories listed below.

CVC2233 / AndroTMem
AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents
☆25 · Jul 5, 2026 · Updated 3 weeks ago
KYRIE-LI11 / VideoMark
☆23 · Aug 23, 2025 · Updated 11 months ago
dingyue772 / OmniSIFT
[ICML2026] OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models
☆26 · May 21, 2026 · Updated 2 months ago
LJungang / Awesome-Video-Reasoning-Landscape
🔥 An open-source survey of the latest video reasoning tasks, paradigms, and benchmarks.
☆190 · Jun 14, 2026 · Updated last month
JoeLeelyf / OVO-Bench
[CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
☆155 · Jul 24, 2025 · Updated last year
Mark12Ding / Dispider
[CVPR 2025] Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
☆180 · Mar 23, 2025 · Updated last year
JavisVerse / JavisGPT
[NeurIPS'25 Spotlight] Official implementation of "JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation"
☆75 · Feb 26, 2026 · Updated 5 months ago
OmniMMI / M4
[CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts
☆19 · Apr 2, 2025 · Updated last year
yellow-binary-tree / MMDuet
Official implementation of the paper "VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interact…
☆45 · Feb 5, 2025 · Updated last year
shiningwhite-cmd / VLA-mark
☆25 · Sep 15, 2025 · Updated 10 months ago
THU-BPM / MarkDiffusion
[JMLR] MarkDiffusion: An Open-Source Toolkit for Generative Watermarking of Latent Diffusion Models
☆324 · Jul 11, 2026 · Updated 2 weeks ago
LaVi-Lab / Rethink_CoT_Video
Official code for "Rethinking Chain-of-Thought Reasoning for Videos"
☆21 · Dec 14, 2025 · Updated 7 months ago
lxa9867 / QSD
[CVPR 2024] "Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition"
☆12 · Feb 27, 2024 · Updated 2 years ago
Share14 / ShareGemini
☆32 · Jul 29, 2024 · Updated 2 years ago
alibaba / ReWatch-R1
[ICLR 2026] ReWatch-R1: Boosting Complex Video Reasoning in Large Vision-Language Models through Agentic Data Synthesis
☆30 · Mar 27, 2026 · Updated 4 months ago
sosppxo / RG-SAN
[NeurIPS 2024 Oral] RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation
☆20 · Dec 22, 2024 · Updated last year
Espere-1119-Song / Video-MMLU
A Massive Multi-Discipline Lecture Understanding Benchmark
☆34 · Apr 20, 2026 · Updated 3 months ago
Leon1207 / 3DRefTR
A PyTorch implementation of 3DRefTR, proposed in the paper "A Unified Framework for 3D Point Cloud Visual Grounding"
☆26 · Aug 24, 2023 · Updated 2 years ago
yaolinli / TimeChat-Online
[ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
☆132 · Jun 29, 2026 · Updated last month
jing-bi / awesome-M.LLM-reasoning
☆20 · May 11, 2025 · Updated last year
longvideobench / LongVideoBench
[Neurips 24' D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.
☆134 · Jul 27, 2024 · Updated 2 years ago
pro-assist / ProAssist
☆20 · Jul 21, 2025 · Updated last year
KD-TAO / OmniZip
[CVPR 2026] OmniZip: Audio-Guided Dynamic Token Compression for Fast Omnimodal Large Language Models
☆102 · Apr 20, 2026 · Updated 3 months ago
NVlabs / FRAG
☆15 · Apr 25, 2025 · Updated last year
Sammy20207109 / DyCo-RL
DyCo-RL: Dynamic Cross-Modal Coordination for Visual Reasoning
☆18 · Jun 14, 2026 · Updated last month
rlqja1107 / NL-VSGG
Official PyTorch implementation of "Weakly Supervised Video Scene Graph Generation via Natural Language Supervision", accepted…
☆25 · Jun 13, 2025 · Updated last year
IVGSZ / Flash-VStream
Official implementation of the ICCV 2025 paper "Flash-VStream: Efficient Real-Time Understanding for Long Video Streams"
☆287 · Oct 15, 2025 · Updated 9 months ago
jindongli-Ai / LLM-Symbolic-Reasoning-Survey
The official GitHub page for the survey paper "A Survey on LLM Symbolic Reasoning". The paper is under review.
☆39 · May 28, 2026 · Updated 2 months ago
kaiw7 / STG-CMA
Towards Efficient Audio-Visual Learners via Empowering Pre-trained Vision Transformers with Cross-Modal Adaptation
☆15 · Apr 13, 2024 · Updated 2 years ago
hurunyi / VideoShield
[ICLR 2025] VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking (Official Implementation)
☆56 · May 30, 2025 · Updated last year
SitongGong / Veason-R1
Official code of Veason-R1
☆15 · Jul 14, 2026 · Updated 2 weeks ago
ldzhangyx / loop-copilot
☆12 · Oct 20, 2023 · Updated 2 years ago
sanxing-chen / HittER
Codebase for the EMNLP 2021 paper "HittER: Hierarchical Transformers for Knowledge Graph Embeddings".
☆12 · Nov 1, 2021 · Updated 4 years ago
yunlong10 / CAT-V
[AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P…
☆68 · Jan 27, 2026 · Updated 6 months ago
CASIA-IVA-Lab / VideoNIAH
VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs
☆57 · Mar 9, 2025 · Updated last year
jokieleung / I-MCTS
Code for EACL 26 Findings paper "I-MCTS: Enhancing Agentic AutoML via Introspective Monte Carlo Tree Search"
☆13 · Jan 28, 2026 · Updated 6 months ago
google-research-datasets / egotempo
☆26 · Jun 19, 2026 · Updated last month
skhcjh231 / MATR_codebase
☆22 · Mar 7, 2025 · Updated last year
Go2Heart / StreamFormer
[ICCV 2025 Oral] Official implementation of "Learning Streaming Video Representation via Multitask Training".
☆93 · Jul 22, 2026 · Updated last week