IVGSZ/Flash-VStream

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/IVGSZ/Flash-VStream)

IVGSZ / Flash-VStream

This is the official implementation of ICCV 2025 "Flash-VStream: Efficient Real-Time Understanding for Long Video Streams"

☆285

Alternatives and similar repositories for Flash-VStream

Users that are interested in Flash-VStream are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

hmxiong / StreamChat
View on GitHub
Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025
☆111Mar 14, 2025Updated last year
JoeLeelyf / OVO-Bench
View on GitHub
[CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
☆153Jul 24, 2025Updated 11 months ago
showlab / videollm-online
View on GitHub
VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)
☆674Nov 26, 2025Updated 7 months ago
shiyi-zh0408 / NAE_CVPR2024
View on GitHub
[CVPR 2024] Narrative Action Evaluation with Prompt-Guided Multimodal Interaction
☆43May 16, 2024Updated 2 years ago
zhang9302002 / ThinkingWithVideos
View on GitHub
The official code of "Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning"
☆101Oct 15, 2025Updated 9 months ago
Managed Database hosting by DigitalOcean • Ad
PostgreSQL, MySQL, MongoDB, Kafka, Valkey, and OpenSearch available. Automatically scale up storage and focus on building your apps.
MCG-NJU / VideoChat-Online
View on GitHub
[CVPR 2025] Online Video Understanding: OVBench and VideoChat-Online
☆97Oct 7, 2025Updated 9 months ago
Becomebright / ReKV
View on GitHub
[ICLR'25] Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
☆121Nov 4, 2025Updated 8 months ago
AndyTang15 / FLAG3Dv2
View on GitHub
☆25May 9, 2024Updated 2 years ago
yaolinli / TimeChat-Online
View on GitHub
[ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
☆132Jun 29, 2026Updated 3 weeks ago
HumanMLLM / ViSpeak
View on GitHub
(ICCV2025) Official repository of paper "ViSpeak: Visual Instruction Feedback in Streaming Videos"
☆52Jul 1, 2025Updated last year
Yxxxb / LAVT-RS
View on GitHub
[CVPR'2022, TPAMI'2024] LAVT: Language-Aware Vision Transformer for Referring Segmentation
☆26Jan 21, 2025Updated last year
yongliu20 / Awesome-Unified-Understanding-and-Generation
View on GitHub
☆52Aug 22, 2025Updated 10 months ago
Mark12Ding / Dispider
View on GitHub
[CVPR 2025]Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
☆180Mar 23, 2025Updated last year
Yxxxb / VoCo-LLaMA
View on GitHub
[CVPR'2025] VoCo-LLaMA: This repo is the official implementation of "VoCo-LLaMA: Towards Vision Compression with Large Language Models".
☆205Jun 18, 2025Updated last year
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
wenhaochai / MovieChat
View on GitHub
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
☆704Jan 29, 2025Updated last year
yellow-binary-tree / MMDuet2
View on GitHub
[ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning
☆40Jan 14, 2026Updated 6 months ago
SuleBai / SC-CLIP
View on GitHub
[TIP 2025] Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation
☆72Mar 27, 2026Updated 3 months ago
EvolvingLMMs-Lab / LongVA
View on GitHub
Long Context Transfer from Language to Vision
☆407Mar 18, 2025Updated last year
bigai-nlco / VideoLLaMB
View on GitHub
[ICCV 2025] Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges
☆87Feb 27, 2025Updated last year
yongliu20 / SCAN
View on GitHub
[CVPR 2024] The repository contains the official implementation of "Open-Vocabulary Segmentation with Semantic-Assisted Calibration"
☆77Sep 23, 2024Updated last year
Vision-CAIR / LongVU
View on GitHub
[ICML 2025] Official PyTorch implementation of LongVU
☆429May 8, 2025Updated last year
Tencent-QQMM / Video-CCAM
View on GitHub
A lightweight flexible Video-MLLM developed by TencentQQ Multimedia Research Team.
☆73Oct 14, 2024Updated last year
Becomebright / GroundVQA
View on GitHub
Official PyTorch code of GroundVQA (CVPR'24)
☆63Sep 13, 2024Updated last year
GPU virtual machines on DigitalOcean Gradient AI • Ad
Get to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
InvincibleWyq / ChatVID
View on GitHub
Chat about anything on any video!
☆39Sep 5, 2023Updated 2 years ago
boheumd / MA-LMM
View on GitHub
(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
☆350Jul 19, 2024Updated 2 years ago
yellow-binary-tree / MMDuet
View on GitHub
Official implementation of paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interact…
☆43Feb 5, 2025Updated last year
AndyTang15 / FLAG3D
View on GitHub
☆19Jun 22, 2026Updated 3 weeks ago
MCG-NJU / StreamForest
View on GitHub
[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
☆131Nov 4, 2025Updated 8 months ago
AMAP-ML / UniVG-R1
View on GitHub
UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning
☆165Jun 2, 2025Updated last year
JUNJIE99 / MLVU
View on GitHub
🔥🔥MLVU: Multi-task Long Video Understanding Benchmark
☆263Apr 13, 2026Updated 3 months ago
OpenGVLab / VideoChat-Flash
View on GitHub
[ICLR2026] VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
☆525Updated this week
magic-research / PLLaVA
View on GitHub
Official repository for the paper PLLaVA
☆669Jul 28, 2024Updated last year
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
gls0425 / LinVT
View on GitHub
LinVT: Empower Your Image-level Large Language Model to Understand Videos
☆83Dec 30, 2024Updated last year
THUNLP-MT / StreamingBench
View on GitHub
StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding
☆167May 16, 2025Updated last year
joslefaure / HERMES
View on GitHub
[ICCV'25] HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics
☆37Sep 10, 2025Updated 10 months ago
maifoundations / Streamo
View on GitHub
Streaming Video Instruction Tuning
☆78Feb 25, 2026Updated 4 months ago
VectorSpaceLab / Video-XL
View on GitHub
🔥🔥First-ever hour scale video understanding models
☆626Jul 14, 2025Updated last year
apple / ml-slowfast-llava
View on GitHub
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
☆290Sep 16, 2024Updated last year
Ahnsun / merlin
View on GitHub
[ECCV2024] Official code implementation of Merlin: Empowering Multimodal LLMs with Foresight Minds
☆97Jul 4, 2024Updated 2 years ago