[CVPR2023] Code for "Streaming Video Model"
☆79 · Jun 14, 2023 · Updated 2 years ago
Alternatives and similar repositories for Streaming-Video-Model
Users who are interested in Streaming-Video-Model are comparing it to the repositories listed below
- [ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model ☆17 · Feb 13, 2025 · Updated last year
- [ICCV-2023]-Universal Video Segmentation For VSS, VPS and VIS ☆110 · Mar 18, 2024 · Updated last year
- DisTime: Distribution-based Time Representation for Video Large Language Models. ☆18 · Jul 10, 2025 · Updated 7 months ago
- Official repository of "TDSD: Text-Driven Scene-Decoupled Weakly Supervised Video Anomaly Detection" ☆11 · May 25, 2025 · Updated 9 months ago
- Tracking Multiple Deformable Objects in Egocentric Videos (CVPR 2023) ☆13 · Apr 10, 2023 · Updated 2 years ago
- 【CVPRW'23】First Place Solution to the CVPR'2023 AQTC Challenge ☆15 · Jul 18, 2023 · Updated 2 years ago
- [CVPR-2023] Semantic-Promoted Debiasing and Background Disambiguation for Zero-Shot Instance Segmentation ☆18 · Jul 2, 2023 · Updated 2 years ago
- A simple PyTorch implementation of CLIP model using DinoV2 and BERT ☆15 · Sep 26, 2023 · Updated 2 years ago
- [NeurIPS 2025] PANDA: Towards Generalist Video Anomaly Detection via Agentic AI Engineer ☆28 · Oct 2, 2025 · Updated 5 months ago
- This is the official implementation of work HiM2SAM in PRCV25. ☆25 · Aug 30, 2025 · Updated 6 months ago
- ☆15 · Apr 9, 2023 · Updated 2 years ago
- Implementation of paper 'Helping Hands: An Object-Aware Ego-Centric Video Recognition Model' ☆33 · Nov 7, 2023 · Updated 2 years ago
- ☆23 · Jun 13, 2023 · Updated 2 years ago
- [ECCV 2024] VISAGE: Video Instance Segmentation with Appearance-Guided Enhancement ☆36 · Jul 29, 2024 · Updated last year
- Official PyTorch code of GroundVQA (CVPR'24) ☆64 · Sep 13, 2024 · Updated last year
- ☆16 · Apr 4, 2025 · Updated 11 months ago
- [ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models ☆347 · May 27, 2024 · Updated last year
- An extension to help download models from CivitAi without leaving WebUI ☆17 · Mar 6, 2023 · Updated 2 years ago
- ☆22 · Mar 4, 2024 · Updated 2 years ago
- EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments ☆25 · May 15, 2025 · Updated 9 months ago
- A faster GMFSS ☆18 · Apr 3, 2023 · Updated 2 years ago
- Related papers about Referring Image Segmentation (RIS) ☆16 · Dec 26, 2023 · Updated 2 years ago
- [CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation". ☆54 · May 25, 2025 · Updated 9 months ago
- Quick Long Video Understanding [TMLR2025] ☆76 · Oct 27, 2025 · Updated 4 months ago
- [TMLR'24] This repository includes the official implementation of our paper "Unleashing the Power of Visual Prompting At the Pixel Level" ☆42 · Apr 30, 2024 · Updated last year
- [ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM ☆86 · Oct 25, 2024 · Updated last year
- Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023) ☆22 · Sep 27, 2023 · Updated 2 years ago
- ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation (CVPR'25) ☆18 · Apr 2, 2025 · Updated 11 months ago
- ☆120 · Feb 19, 2024 · Updated 2 years ago
- General video interaction platform based on LLMs, including Video ChatGPT ☆256 · Jul 26, 2023 · Updated 2 years ago
- ☆20 · Jan 5, 2023 · Updated 3 years ago
- This is the official implementation of RGNet: A Unified Retrieval and Grounding Network for Long Videos ☆19 · Mar 3, 2025 · Updated last year
- [CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding ☆686 · Jan 29, 2025 · Updated last year
- ☆23 · Jul 26, 2024 · Updated last year
- ☆22 · Dec 11, 2024 · Updated last year
- spatio-temporal tasks ☆16 · Jul 15, 2024 · Updated last year
- GPP CPassword Decryption Tools ☆12 · Jun 13, 2022 · Updated 3 years ago
- ☆25 · Mar 30, 2025 · Updated 11 months ago
- Compress conventional Vision-Language Pre-training data ☆53 · Sep 22, 2023 · Updated 2 years ago