yellow-binary-tree/MMDuet2

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/yellow-binary-tree/MMDuet2)

yellow-binary-tree / MMDuet2

[ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning

☆40

Alternatives and similar repositories for MMDuet2

Users that are interested in MMDuet2 are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

SooLab / EyeWO
View on GitHub
[NeurIPS2025] The official PyTorch implementation of the "Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video".
☆34Dec 25, 2025Updated 6 months ago
air-embodied-brain / Em-Garde
View on GitHub
Implementation of Em_Garde: a proposal-retrieval framework for streaming video understanding
☆26Jun 24, 2026Updated 3 weeks ago
apple / ml-streambridge
View on GitHub
☆40Nov 5, 2025Updated 8 months ago
maifoundations / Streamo
View on GitHub
Streaming Video Instruction Tuning
☆78Feb 25, 2026Updated 4 months ago
JoeLeelyf / OVO-Bench
View on GitHub
[CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
☆153Jul 24, 2025Updated 11 months ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
MCG-NJU / StreamForest
View on GitHub
[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
☆131Nov 4, 2025Updated 8 months ago
aurateam2026 / AURA
View on GitHub
☆114Jun 5, 2026Updated last month
OmniMMI / OmniMMI
View on GitHub
[CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts
☆23Updated this week
hmxiong / StreamChat
View on GitHub
Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025
☆111Mar 14, 2025Updated last year
yaolinli / TimeChat-Online
View on GitHub
[ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
☆132Jun 29, 2026Updated 3 weeks ago
CASIA-IVA-Lab / ThinkStream
View on GitHub
☆40Jun 18, 2026Updated last month
Becomebright / ReKV
View on GitHub
[ICLR'25] Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
☆121Nov 4, 2025Updated 8 months ago
wanglu-cs / Think_While_Watching
View on GitHub
☆19Jun 26, 2026Updated 3 weeks ago
haowei-freesky / HERMES
View on GitHub
Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding" [ACL 2026]
☆92May 8, 2026Updated 2 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
minghangz / OnVTG
View on GitHub
Online video temporal grounding
☆16Oct 20, 2025Updated 9 months ago
depixels / Advanced-LLM-Systems-Learning
View on GitHub
本项目旨在系统性学习和记录大语言模型（LLM）系统领域的核心知识，重点关注分布式训练、推理优化、强化学习（RLHF）对齐的原理、主流框架和工程实践。
☆15Apr 9, 2026Updated 3 months ago
IVGSZ / Flash-VStream
View on GitHub
This is the official implementation of ICCV 2025 "Flash-VStream: Efficient Real-Time Understanding for Long Video Streams"
☆285Oct 15, 2025Updated 9 months ago
EIT-NLP / Awesome-Streaming-LLMs
View on GitHub
🔥This is a repository of paper list for streaming LLMs/MLLMs.
☆24Apr 19, 2026Updated 3 months ago
yaolinli / TimeChat-Captioner
View on GitHub
[ICML 2026] Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions
☆47Jun 29, 2026Updated 3 weeks ago
HuiGuanLab / RaTSG
View on GitHub
This is a repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos"
☆13Aug 22, 2025Updated 10 months ago
sotayang / Awesome-Streaming-Video-Understanding
View on GitHub
🔥🔥🔥 [Awesome] Latest Papers, Codes & Datasets on Streaming / Online Video Understanding — Building Always-on, Real-time Video AI 🤖
☆409Jul 2, 2026Updated 2 weeks ago
EvolvingLMMs-Lab / SimpleStream
View on GitHub
A simple video streaming baseline that outperforms SOTAs.
☆148May 1, 2026Updated 2 months ago
showlab / livecc
View on GitHub
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025)
☆462Oct 29, 2025Updated 8 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
mit-han-lab / streaming-vlm
View on GitHub
StreamingVLM: Real-Time Understanding for Infinite Video Streams
☆1,046Oct 15, 2025Updated 9 months ago
lcqysl / FrameThinker
View on GitHub
[ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting"
☆50Oct 9, 2025Updated 9 months ago
EIT-NLP / Speak-While-Watching
View on GitHub
☆17Mar 1, 2026Updated 4 months ago
Kwai-Klear / mini-swe-agent-plus
View on GitHub
mini-swe-agent-plus: a tiny (~100 LOC) GitHub issue fixer—now with a robust multi-line text edit tool.
☆24Jan 20, 2026Updated 6 months ago
daeunni / Video-Skill-CoT
View on GitHub
Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Findings]"
☆18Aug 27, 2025Updated 10 months ago
pro-assist / ProAssist
View on GitHub
☆20Jul 21, 2025Updated 11 months ago
iLearn-Lab / CVPR25-LION-FS
View on GitHub
[CVPR 2025] LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant
☆29Dec 2, 2025Updated 7 months ago
xinding-bot / StreamMind
View on GitHub
[ICCV 2025] StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition
☆71Jun 25, 2025Updated last year
showlab / videollm-online
View on GitHub
VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)
☆674Nov 26, 2025Updated 7 months ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
EIT-NLP / StreamingLLM
View on GitHub
Repository of Streaming LLMs
☆89Jun 20, 2026Updated last month
BeyondScene / BeyondScene
View on GitHub
[ECCV 2024] BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion
☆21Jul 2, 2024Updated 2 years ago
wgcyeo / WorldMM
View on GitHub
[CVPR 2026 Highlight] WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning
☆96Jun 18, 2026Updated last month
Li-Hao-yuan / GeoThinker
View on GitHub
☆68Feb 12, 2026Updated 5 months ago
SooLab / MVTokenFlow
View on GitHub
[ICLR 2025] MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow
☆27Apr 9, 2025Updated last year
lern-to-write / STC
View on GitHub
[CVPR 2026] Accelerating Streaming Video Large Language Models via Hierarchical Token Compression
☆70Jun 8, 2026Updated last month
doc-doc / EgoBlind
View on GitHub
EgoBlind: Towards Egocentric Visual Assistance for the Blind (NeurIPS'25, D&B Track)
☆23Apr 20, 2026Updated 3 months ago