1ranGuan/VST

[ECCV 26] Video Streaming Thinking

☆114

Alternatives and similar repositories for VST

Users who are interested in VST are comparing it to the repositories listed below.

1ranGuan / thinkomni
[ICLR26] ThinkOmni: Lifting Textual Reasoning to Omni-modal Scenarios via Guidance Decoding
☆93 · Mar 20, 2026 · Updated 4 months ago
H-EmbodVis / HERMESV2
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation
☆65 · May 1, 2026 · Updated 2 months ago
XenoZLH / Shuffle-R1
Official code repository of Shuffle-R1
☆26 · Feb 23, 2026 · Updated 4 months ago
H-EmbodVis / NUMINA
[CVPR 2026] When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models
☆68 · Apr 11, 2026 · Updated 3 months ago
H-EmbodVis / PointTPA
[CVPR 2026] PointTPA: Dynamic Network Parameter Adaptation for 3D Scene Understanding
☆33 · Apr 7, 2026 · Updated 3 months ago
H-EmbodVis / DOMINO
[ECCV 2026] Towards Generalizable Robotic Manipulation in Dynamic Environments
☆226 · Jun 30, 2026 · Updated 2 weeks ago
DYZhang09 / ViTWSS3D
[ICCV 23] A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection
☆13 · Apr 12, 2024 · Updated 2 years ago
H-EmbodVis / MERGE
[NeurIPS 2025] More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models
☆219 · Oct 31, 2025 · Updated 8 months ago
dk-liang / UniFuture
[ICRA 2026] UniFuture: A 4D Driving World Model for Future Generation and Perception
☆161 · Feb 26, 2026 · Updated 4 months ago
CASIA-IVA-Lab / ThinkStream
☆40 · Jun 18, 2026 · Updated last month
LMD0311 / HERMES
[ICCV 2025] HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation
☆259 · May 12, 2026 · Updated 2 months ago
DYZhang09 / ToC3D
[ECCV 2024] Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression
☆53 · Sep 21, 2024 · Updated last year
xiaomi-mlab / MindDrive
[ECCV 2026] Official code of "MindDrive: A Vision-Language-Action Model for Autonomous Driving via Online Reinforcement Learning"
☆241 · Jun 23, 2026 · Updated 3 weeks ago
H-EmbodVis / VEGA-3D
[ECCV 2026] Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding
☆421 · Jun 18, 2026 · Updated last month
H-EmbodVis / GRANT
[AAAI 2026 Oral] Cook and Clean Together: Teaching Embodied Agents for Parallel Task Execution
☆363 · Dec 12, 2025 · Updated 7 months ago
HongkLin / TIDE
[CVPR 2025] A Unified Image-Dense Annotation Generation Model for Underwater Scenes
☆60 · Apr 9, 2025 · Updated last year
dk-liang / UniSeg3D
[NeurIPS 2024] A Unified Framework for 3D Scene Understanding
☆179 · Jul 7, 2025 · Updated last year
zc-zhao / DriveMonkey
The official code of DriveMonkey
☆45 · Mar 20, 2026 · Updated 4 months ago
H-EmbodVis / HyDRA
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models
☆265 · Apr 29, 2026 · Updated 2 months ago
maifoundations / Streamo
Streaming Video Instruction Tuning
☆78 · Feb 25, 2026 · Updated 4 months ago
H-EmbodVis / EasyCache
Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching
☆291 · May 12, 2026 · Updated 2 months ago
EIT-NLP / StreamingLLM
Repository of Streaming LLMs
☆89 · Jun 20, 2026 · Updated last month
HumanMLLM / LOVE-R1
Official repository of paper "LOVE-R1: Advancing Long Video Understanding with Adaptive Zoom-in Mechanism via Multi-Step Reasoning"
☆24 · Nov 1, 2025 · Updated 8 months ago
gangweix / next-forcing
Next Forcing: World Action Modeling with Multi-Chunk Prediction (MCP)
☆109 · Updated this week
MCG-NJU / StreamForest
[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
☆131 · Nov 4, 2025 · Updated 8 months ago
xiaomi-mlab / Orion
[ICCV 2025] Official code of "ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation"
☆652 · Jun 22, 2026 · Updated 3 weeks ago
xiaofei030 / campus_sentiment_analysis
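A campus public-opinion monitoring and sentiment analysis platform built on a multi-agent + MCP + Skill architecture, aimed at counselors and student affairs staff. It is intended to help schools better understand what students are saying, provide more attentive services, and effectively address the issues students care about.
☆92 · Feb 26, 2026 · Updated 4 months ago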
sg-3d / sg3d
☆55 · Oct 3, 2024 · Updated last year
H-EmbodVis / NAUTILUS
[NeurIPS 2025] NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding
☆368 · Dec 18, 2025 · Updated 7 months ago
wanglu-cs / Think_While_Watching
☆19 · Jun 26, 2026 · Updated 3 weeks ago
EvolvingLMMs-Lab / SimpleStream
A simple video streaming baseline that outperforms SOTAs.
☆148 · May 1, 2026 · Updated 2 months ago
shalfun / DriVerse
[ACMMM 2025] Official implementation of the paper "DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompti…
☆220 · May 7, 2025 · Updated last year
Vchitect / Uni-MMMU
[ACL2026 oral] Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark
☆25 · Apr 13, 2026 · Updated 3 months ago
JoeLeelyf / OVO-Bench
[CVPR 2025] OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
☆153 · Jul 24, 2025 · Updated 11 months ago
IVUL-KAUST / VideoAuto-R1
[CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
☆88 · Feb 27, 2026 · Updated 4 months ago
CryptoDmitry / hermes-agent-control-room
Control Room-first template for managing Hermes agents from one VPS agent to specialist teams and orchestrated workflows
☆868 · May 18, 2026 · Updated 2 months ago
Passenger12138 / attention-map-diffusers-vdm
☆12 · Feb 13, 2025 · Updated last year
LSXI7 / MINIMA
[CVPR 2025] MINIMA: Modality Invariant Image Matching
☆652 · Oct 9, 2025 · Updated 9 months ago
ydyhello / Awesome-VLM-Streaming-Video
📚 A curated collection of papers and open-source code repositories dedicated to the application of Vision-Language Models (VLMs) for str…
☆188 · Jun 10, 2026 · Updated last month