sotayang/Awesome-Streaming-Video-Understanding

sotayang / Awesome-Streaming-Video-Understanding

🔥🔥🔥 [Awesome] Latest Papers, Codes & Datasets on Streaming / Online Video Understanding — Building Always-on, Real-time Video AI 🤖

☆419

Alternatives and similar repositories for Awesome-Streaming-Video-Understanding

Users who are interested in Awesome-Streaming-Video-Understanding are comparing it to the repositories listed below.

ydyhello / Awesome-VLM-Streaming-Video
📚 A curated collection of papers and open-source code repositories dedicated to the application of Vision-Language Models (VLMs) for str…
☆189 · Jul 22, 2026 · Updated last week
maifoundations / Streamo
Streaming Video Instruction Tuning
☆83 · Feb 25, 2026 · Updated 5 months ago
haowei-freesky / HERMES
Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding" [ACL 2026]
☆93 · May 8, 2026 · Updated 2 months ago
Becomebright / ReKV
[ICLR'25] Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
☆122 · Nov 4, 2025 · Updated 8 months ago
JoeLeelyf / OVO-Bench
[CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
☆155 · Jul 24, 2025 · Updated last year
EvolvingLMMs-Lab / SimpleStream
A simple video streaming baseline that outperforms SOTAs.
☆153 · May 1, 2026 · Updated 2 months ago
lern-to-write / STC
[CVPR 2026] Accelerating Streaming Video Large Language Models via Hierarchical Token Compression
☆70 · Jun 8, 2026 · Updated last month
aurateam2026 / AURA
☆118 · Jun 5, 2026 · Updated last month
MCG-NJU / StreamForest
[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
☆133 · Nov 4, 2025 · Updated 8 months ago
EIT-NLP / StreamingLLM
Repository of Streaming LLMs
☆91 · Updated this week
yaolinli / TimeChat-Online
[ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
☆132 · Jun 29, 2026 · Updated last month
LJungang / Awesome-Video-Reasoning-Landscape
🔥 An open-source survey of the latest video reasoning tasks, paradigms, and benchmarks.
☆190 · Jun 14, 2026 · Updated last month
mit-han-lab / streaming-vlm
StreamingVLM: Real-Time Understanding for Infinite Video Streams
☆1,048 · Oct 15, 2025 · Updated 9 months ago
sotayang / LiveStar
[NeurIPS'2025] Official repository for "LiveStar: Live Streaming Assistant for Real-World Online Video Understanding"
☆155 · Jul 3, 2026 · Updated 3 weeks ago
CASIA-IVA-Lab / ThinkStream
☆41 · Jun 18, 2026 · Updated last month
SooLab / EyeWO
[NeurIPS2025] The official PyTorch implementation of the "Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video".
☆35 · Dec 25, 2025 · Updated 7 months ago
aiha-lab / InfiniPot-V
[NeurIPS 25] InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding
☆20 · Jan 25, 2026 · Updated 6 months ago
ShareLab-SII / FluxMem
[CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding
☆74 · Mar 16, 2026 · Updated 4 months ago
wanglu-cs / Think_While_Watching
☆19 · Jun 26, 2026 · Updated last month
hmxiong / StreamChat
Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025
☆111 · Mar 14, 2025 · Updated last year
yellow-binary-tree / MMDuet2
[ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning
☆42 · Jan 14, 2026 · Updated 6 months ago
air-embodied-brain / Em-Garde
Implementation of Em_Garde: a proposal-retrieval framework for streaming video understanding
☆26 · Jun 24, 2026 · Updated last month
apple / ml-streambridge
☆40 · Nov 5, 2025 · Updated 8 months ago
reneverland / CBIT-AiExam-plus
A general‑purpose AI‑powered examination platform for schools, training providers, enterprises, and online programs. It delivers multi‑di…
☆277 · Sep 28, 2025 · Updated 10 months ago
reneverland / CBIT-AiStudio
This is an enterprise-level AI image generation platform based on ComfyUI, focusing on photorealistic human image generation. It advanced …
☆254 · Oct 3, 2025 · Updated 9 months ago
IVUL-KAUST / VideoAuto-R1
[CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
☆88 · Feb 27, 2026 · Updated 5 months ago
irisx3 / attention_drl_trading
Attention-based Deep Reinforcement Learning framework for portfolio allocation on S&P 500 equities. Includes custom environment, policy a…
☆164 · Oct 16, 2025 · Updated 9 months ago
Sijie-Yang / TripCard
☆72 · Oct 18, 2025 · Updated 9 months ago
sotayang / SVBench
[ICLR'2025 Spotlight] Official repository for "SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding"
☆121 · Jul 2, 2026 · Updated 3 weeks ago
showlab / livecc
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025)
☆469 · Oct 29, 2025 · Updated 9 months ago
Mark12Ding / Dispider
[CVPR 2025] Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
☆180 · Mar 23, 2025 · Updated last year
OmniMMI / OmniMMI
[CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts
☆23 · Jul 14, 2026 · Updated 2 weeks ago
EvolvingLMMs-Lab / OneVision-Encoder
Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence
☆386 · Jun 20, 2026 · Updated last month
EvolvingLMMs-Lab / lmms-eval
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
☆4,337 · Updated this week
yellow-binary-tree / MMDuet
Official implementation of paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interact…
☆45 · Feb 5, 2025 · Updated last year
1ranGuan / VST
[ECCV 26] Video Streaming Thinking
☆117 · Jun 18, 2026 · Updated last month
OpenMOSS / MOSS-Video-Preview
A real-time video understanding foundation model with gated cross-attention. Offline & real-time inference.
☆165 · Jul 16, 2026 · Updated last week
remifan / commplax
The lib for https://remifan.github.io/gdbp_study/article.html see also: https://github.com/remifan/gdbp_study
☆295 · Mar 4, 2026 · Updated 4 months ago
SylviaLiuQAQ / -
☆192 · Mar 14, 2025 · Updated last year