[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
☆123Nov 4, 2025Updated 6 months ago
Alternatives and similar repositories for StreamForest
Users that are interested in StreamForest are comparing it to the libraries listed below.
- [CVPR 2025] LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant☆30Dec 2, 2025Updated 5 months ago
- [ICLR 2026] MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning☆35Jan 14, 2026Updated 4 months ago
- [CVPR 2025] OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts☆17Apr 2, 2025Updated last year
- [CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos☆11Jun 11, 2024Updated last year
- [WACV 2025 Oral] Transferring Foundation Models for Generalizable Robotic Manipulation☆27Mar 28, 2025Updated last year
- [CVPR 2026] Accelerating Streaming Video Large Language Models via Hierarchical Token Compression☆65Feb 25, 2026Updated 2 months ago
- VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Model☆15Jul 31, 2025Updated 9 months ago
- [ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision☆12Sep 17, 2023Updated 2 years ago
- Open-Vocabulary Panoptic Segmentation☆27Jun 15, 2025Updated 11 months ago
- [NeurIPS 2024] Exploring DCN-like Architectures for Fast Image Generation with Arbitrary Resolution☆35Dec 23, 2024Updated last year
- Code of the Grounded MUIE model, REAMO☆10Dec 3, 2024Updated last year
- Train a model using YOLOv5 to achieve mouse control, screen capture, and real-time auto-aim recognition.☆12Jun 13, 2024Updated last year
- [CVPR 2025] Online Video Understanding: OVBench and VideoChat-Online☆94Oct 7, 2025Updated 7 months ago
- [CVPR 2024] Task-aligned Part-aware Panoptic Segmentation through Joint Object-Part Representations☆24Jan 20, 2025Updated last year
- Official Implementation for ACM MM2024 paper "VrdONE: One-stage Video Visual Relation Detection".☆12Nov 13, 2024Updated last year
- [ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models☆24Apr 18, 2026Updated last month
- [AAAI 2023 Oral] CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets☆38Aug 20, 2024Updated last year
- [ICLR'25] Streaming Video Question-Answering with In-context Video KV-Cache Retrieval☆121Nov 4, 2025Updated 6 months ago
- [ACL 2025] The official PyTorch implementation of "MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection".☆26May 26, 2025Updated 11 months ago
- [ICCV 2025] Object-centric Video Question Answering with Visual Grounding and Referring☆25Aug 8, 2025Updated 9 months ago
- [NeurIPS 2025] Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior☆79Feb 20, 2026Updated 2 months ago
- [Awesome] 🔥🔥🔥 Latest Papers, Codes and Datasets on Streaming / Online Video Understanding☆236May 7, 2026Updated last week
- ☆17Dec 13, 2023Updated 2 years ago
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation☆33Dec 22, 2025Updated 4 months ago
- [NeurIPS'2025] Official repository for "LiveStar: Live Streaming Assistant for Real-World Online Video Understanding"☆115Nov 25, 2025Updated 5 months ago
- OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams☆99Mar 15, 2026Updated 2 months ago
- [ICLR 2025] Breaking Mental Set to Improve Reasoning through Diverse Multi-Agent Debate☆20Apr 22, 2025Updated last year
- [NeurIPS'25] ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding☆52Sep 21, 2025Updated 7 months ago
- [Neurocomputing] EmoVerse: Enhancing Multimodal Large Language Models for Affective Computing via Multitask Learning☆18Jul 6, 2025Updated 10 months ago
- [CVPR 2025] Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning☆57Apr 1, 2025Updated last year
- [ICCV 2021] Relaxed Transformer Decoders for Direct Action Proposal Generation☆92Apr 5, 2022Updated 4 years ago
- [ECCV 2024] RGBD GS-ICP SLAM☆14Nov 5, 2024Updated last year
- Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023)☆22Mar 23, 2026Updated last month
- [CVPR2025] Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models☆20Apr 30, 2025Updated last year
- The implementation of the paper "Efficient Attention Network: Accelerate Attention by Searching Where to Plug".☆20Jun 16, 2023Updated 2 years ago
- Official PyTorch implementation Source code for Weakly Supervised Video Scene Graph Generation via Natural Language Supervision, accepted…☆24Jun 13, 2025Updated 11 months ago
- ☆18Nov 30, 2025Updated 5 months ago
- ☆19Oct 22, 2023Updated 2 years ago
- RLHF for Video Diffusion Models☆26Jul 30, 2025Updated 9 months ago