Yang011013/Awesome-Streaming-Video-Understanding

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Yang011013/Awesome-Streaming-Video-Understanding)

Yang011013 / Awesome-Streaming-Video-Understanding

Awesome latest models, datasets and benchmarks on streaming/online video understanding.

☆31

Alternatives and similar repositories for Awesome-Streaming-Video-Understanding

Users that are interested in Awesome-Streaming-Video-Understanding are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

G-JWLee / TAMP
View on GitHub
☆12May 15, 2025Updated last year
MCG-NJU / StreamForest
View on GitHub
[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
☆133Nov 4, 2025Updated 8 months ago
maifoundations / Streamo
View on GitHub
Streaming Video Instruction Tuning
☆83Feb 25, 2026Updated 5 months ago
ydyhello / Awesome-VLM-Streaming-Video
View on GitHub
📚 A curated collection of papers and open-source code repositories dedicated to the application of Vision-Language Models (VLMs) for str…
☆189Updated this week
jylins / videoseek
View on GitHub
[CVPR 2026] VideoSeek: Long-Horizon Video Agent with Tool-Guided Seeking
☆64Mar 23, 2026Updated 4 months ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
yaolinli / TimeChat-Online
View on GitHub
[ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
☆132Jun 29, 2026Updated 3 weeks ago
bethgelab / supersanity
View on GitHub
A critical analysis of the Cambrian-S model and VSI-Super benchmarks
☆16Nov 20, 2025Updated 8 months ago
aiha-lab / InfiniPot-V
View on GitHub
[NeurIPS 25] InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding
☆20Jan 25, 2026Updated 6 months ago
lern-to-write / STC
View on GitHub
[CVPR 2026] Accelerating Streaming Video Large Language Models via Hierarchical Token Compression
☆70Jun 8, 2026Updated last month
hmxiong / StreamChat
View on GitHub
Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025
☆111Mar 14, 2025Updated last year
EvolvingLMMs-Lab / SimpleStream
View on GitHub
A simple video streaming baseline that outperforms SOTAs.
☆152May 1, 2026Updated 2 months ago
nlp-waseda / traveling-across-languages
View on GitHub
Official repo and evaluation implementation of KnowRecall and VisRecall
☆10May 22, 2025Updated last year
pipixin321 / Awesome-Video-MLLMs
View on GitHub
Awesome MLLMs/Benchmarks for Short/Long/Streaming Video Understanding
☆71Sep 1, 2025Updated 10 months ago
xlyu0106 / VisMem
View on GitHub
☆91Feb 5, 2026Updated 5 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
Aoko955 / Flash-VAED
View on GitHub
[ICML 2026] Official codebase for "Flash-VAED: Plug-and-Play VAE Decoders for Efficient Video Generation"
☆27May 9, 2026Updated 2 months ago
Becomebright / ReKV
View on GitHub
[ICLR'25] Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
☆122Nov 4, 2025Updated 8 months ago
ShareLab-SII / FluxMem
View on GitHub
[CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding
☆73Mar 16, 2026Updated 4 months ago
CYWang735 / AdaTooler-V
View on GitHub
☆71Feb 27, 2026Updated 4 months ago
LaVi-Lab / Rethink_CoT_Video
View on GitHub
Official code for "Rethinking Chain-of-Thought Reasoning for Videos"
☆21Dec 14, 2025Updated 7 months ago
YIGE24 / StreamingTOM
View on GitHub
☆27Mar 5, 2026Updated 4 months ago
AlbertTan404 / Think-Then-React
View on GitHub
[ICLR 2025] Think Then React: Towards Unconstrained Action-to-Reaction Motion Generation
☆21Mar 21, 2025Updated last year
fast3r-3d / fast3r-3d.github.io
View on GitHub
☆11Mar 4, 2025Updated last year
zjuruizhechen / Awesome-Video-Agent
View on GitHub
A collection of awesome think with videos papers.
☆100Dec 1, 2025Updated 7 months ago
AI Agents on DigitalOcean Gradient AI Platform • Ad
Build production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
zhengzhi-1997 / LLM-TRSR
View on GitHub
☆16May 22, 2025Updated last year
CR400AF-A / SparseMM
View on GitHub
[ICCV 2025] SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs
☆88Jan 17, 2026Updated 6 months ago
huweim / dataflow_architecture
View on GitHub
Research about dataflow architecture
☆15Nov 30, 2023Updated 2 years ago
iLearn-Lab / CVPR25-LION-FS
View on GitHub
[CVPR 2025] LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant
☆29Dec 2, 2025Updated 7 months ago
THUNLP-MT / StreamingBench
View on GitHub
StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding
☆167May 16, 2025Updated last year
yellow-binary-tree / MMDuet2
View on GitHub
[ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning
☆42Jan 14, 2026Updated 6 months ago
fansunqi / VideoTool
View on GitHub
Official Repository for NeurIPS'25 Paper "Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task"
☆23May 18, 2026Updated 2 months ago
youngfly11 / ReIR-WeaklyGrounding.pytorch
View on GitHub
The official PyTorch code for "Relation-aware Instance Refinement for Weakly Supervised Visual Grounding" accepted by CVPR2021
☆28Oct 9, 2021Updated 4 years ago
zhang9302002 / ThinkingWithVideos
View on GitHub
The official code of "Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning"
☆102Oct 15, 2025Updated 9 months ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
zhshj0110 / Awesome-Motion-Diffusion-Models
View on GitHub
A collection of resources and papers on Motion Diffusion Models.
☆39Jun 10, 2025Updated last year
JackYu6 / HERO_release
View on GitHub
PyTorch implementation of "HERO: Human Reaction Generation from Videos (ICCV 2025)"
☆37Mar 27, 2026Updated 4 months ago
SijieSong / CVPR21-Cogrounding_semantic_attention
View on GitHub
☆14Jul 13, 2021Updated 5 years ago
gfloros / OpenGlue
View on GitHub
Open Source Graph Neural Net Based Pipeline for Image Matching
☆21Jun 6, 2025Updated last year
yellow-binary-tree / MMDuet
View on GitHub
Official implementation of paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interact…
☆45Feb 5, 2025Updated last year
w-yibo / VTC-R1
View on GitHub
VTC-R1: Vision-Text Compression for Efficient Long-Context Reasoning.
☆26Updated this week
thustorage / Frugal
View on GitHub
for paper @ ASPLOS‘25’
☆16Mar 27, 2025Updated last year