haowei-freesky/HERMES

Official Repository for paper "HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding" [ACL 2026]

☆93

Alternatives and similar repositories for HERMES

Users who are interested in HERMES are comparing it to the repositories listed below.

lern-to-write / STC
[CVPR 2026] Accelerating Streaming Video Large Language Models via Hierarchical Token Compression
☆70 · Jun 8, 2026 · Updated last month
aiha-lab / InfiniPot-V
[NeurIPS 25] InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding
☆20 · Jan 25, 2026 · Updated 6 months ago
EvolvingLMMs-Lab / SimpleStream
A simple video streaming baseline that outperforms SOTAs.
☆153 · May 1, 2026 · Updated 2 months ago
Becomebright / ReKV
[ICLR'25] Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
☆122 · Nov 4, 2025 · Updated 8 months ago
sotayang / Awesome-Streaming-Video-Understanding
🔥🔥🔥 [Awesome] Latest Papers, Codes & Datasets on Streaming / Online Video Understanding — Building Always-on, Real-time Video AI 🤖
☆419 · Jul 2, 2026 · Updated 3 weeks ago
apple / ml-streambridge
☆40 · Nov 5, 2025 · Updated 8 months ago
ShareLab-SII / FluxMem
[CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding
☆74 · Mar 16, 2026 · Updated 4 months ago
MCG-NJU / StreamForest
[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
☆133 · Nov 4, 2025 · Updated 8 months ago
yellow-binary-tree / MMDuet2
[ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning
☆42 · Jan 14, 2026 · Updated 6 months ago
OpenMOSS / FutureOmni
☆26 · Jan 22, 2026 · Updated 6 months ago
SooLab / EyeWO
[NeurIPS2025] The official PyTorch implementation of the "Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video".
☆35 · Dec 25, 2025 · Updated 7 months ago
air-embodied-brain / Em-Garde
Implementation of Em_Garde: a proposal-retrieval framework for streaming video understanding
☆26 · Jun 24, 2026 · Updated last month
IVUL-KAUST / VideoAuto-R1
[CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
☆88 · Feb 27, 2026 · Updated 5 months ago
ydyhello / Awesome-VLM-Streaming-Video
📚 A curated collection of papers and open-source code repositories dedicated to the application of Vision-Language Models (VLMs) for str…
☆189 · Jul 22, 2026 · Updated last week
JoeLeelyf / OVO-Bench
[CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
☆155 · Jul 24, 2025 · Updated last year
dengandong / GroundMoRe
☆18 · May 18, 2026 · Updated 2 months ago
city1517 / FlexMem
[CVPR2026 Highlight] FlexMem: Scaling the Long Video Understanding of MLLMs via Visual Memory Mechanism
☆31 · Apr 10, 2026 · Updated 3 months ago
wanglu-cs / Think_While_Watching
☆19 · Jun 26, 2026 · Updated last month
Jialuo-Li / DIG
[CVPR 2026] Divide, then Ground: Adapting Frame Selection to Query Types for Long-Form Video Understanding
☆21 · Feb 21, 2026 · Updated 5 months ago
mit-han-lab / streaming-vlm
StreamingVLM: Real-Time Understanding for Infinite Video Streams
☆1,048 · Oct 15, 2025 · Updated 9 months ago
YIGE24 / StreamingTOM
☆27 · Mar 5, 2026 · Updated 4 months ago
KD-TAO / DyCoke
[CVPR 2025] DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models
☆114 · Nov 22, 2025 · Updated 8 months ago
maifoundations / Streamo
Streaming Video Instruction Tuning
☆83 · Feb 25, 2026 · Updated 5 months ago
hmxiong / StreamChat
Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" ICLR2025
☆111 · Mar 14, 2025 · Updated last year
OpenMOSS / BandPO
Official implementation of BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning.…
☆49 · Apr 8, 2026 · Updated 3 months ago
aurateam2026 / AURA
☆118 · Jun 5, 2026 · Updated last month
zhengdian1 / AIA
☆45 · Jan 4, 2026 · Updated 6 months ago
xuyang-liu16 / VidCom2
[EMNLP 2025 Main] Video Compression Commander: Plug-and-Play Inference Acceleration for Video Large Language Models
☆128 · May 14, 2026 · Updated 2 months ago
xinding-bot / StreamMind
[ICCV 2025] StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition
☆73 · Jun 25, 2025 · Updated last year
yaolinli / TimeChat-Online
[ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
☆132 · Jun 29, 2026 · Updated last month
viiika / Prism
[ICML 2026] Official Implementation of Prism: Efficient Test-Time Scaling via Hierarchical Search and Self-Verification for Discrete Diff…
☆22 · Mar 4, 2026 · Updated 4 months ago
HYUNJS / STTM
[ICCV 2025] Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs
☆61 · Feb 2, 2026 · Updated 5 months ago
OpenGVLab / PVC
[CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models
☆54 · Jun 12, 2025 · Updated last year
LunarShen / FastVID
[NeurIPS 2025] FastVID: Dynamic Density Pruning for Fast Video Large Language Models
☆37 · Nov 10, 2025 · Updated 8 months ago
Mark12Ding / Dispider
[CVPR 2025] Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
☆180 · Mar 23, 2025 · Updated last year
EliSpectre / MM-Mem
[ACL-26 (main)] From Verbatim to Gist Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video A…
☆39 · Apr 19, 2026 · Updated 3 months ago
PeiwenSun2000 / SpaceVista
The official repo for SpaceVista: All-Scale Visual Spatial Reasoning from mm to km.
☆43 · May 26, 2026 · Updated 2 months ago
daeunni / StreamGaze
Code for "StreamGaze: Gaze-Guided Temporal Reasoning and Proactive Understanding in Streaming Videos"
☆27 · May 13, 2026 · Updated 2 months ago
euReKa025 / AgentLongBench
☆22 · Jan 29, 2026 · Updated 6 months ago