woven-visionai / wts-dataset
☆34Updated 2 weeks ago
Alternatives and similar repositories for wts-dataset
Users who are interested in wts-dataset are comparing it to the repositories listed below.
- [CVPRW 2024] TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning. Official code for the 3rd place solution of t…☆36Updated 3 months ago
- ☆47Updated 10 months ago
- ☆14Updated last year
- [ICCV2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer☆36Updated last year
- [ECCV 2024] Beyond MOT: Semantic Multi-Object Tracking☆47Updated 5 months ago
- [CVPR2024 Highlight] Official repository of the paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detect…☆54Updated last month
- Official implementation of the paper "OvarNet: Towards Open-vocabulary Object Attribute Recognition"☆102Updated 2 years ago
- [CVPR2024 Highlight] The official repo for paper "Abductive Ego-View Accident Video Understanding for Safe Driving Perception"☆52Updated last month
- Code for our paper "Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers"☆36Updated last year
- ☆18Updated 2 years ago
- [ICCV2023] AIDE: A Vision-Driven Multi-View, Multi-Modal, Multi-Tasking Dataset for Assistive Driving Perception☆42Updated last year
- [ECCV 2024] Beyond MOT: Semantic Multi-Object Tracking☆30Updated 8 months ago
- Codes for ICML 2023 Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation☆37Updated last year
- Codes From Top Teams in 2023 AIC challenge☆78Updated last year
- Foundation Models for Video Understanding: A Survey☆120Updated 8 months ago
- [ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM☆73Updated 6 months ago
- Distilling Large Vision-Language Model with Out-of-Distribution Generalizability (ICCV 2023)☆56Updated last year
- Official Code of CVPR'23 Paper "VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision"☆22Updated last year
- Official implementation of "Delving into CLIP latent space for Video Anomaly Recognition", CVIU 2024☆60Updated 5 months ago
- Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models☆107Updated last month
- Official implementation of the CVPR paper Open-TransMind: A New Baseline and Benchmark for 1st Foundation Model Challenge of Intelligent …☆26Updated last year
- [ICCV'23] Cascade-DETR: Delving into High-Quality Universal Object Detection☆98Updated last year
- Video Feature Enhancement with PyTorch☆29Updated 5 months ago
- OVTrack: Open-Vocabulary Multiple Object Tracking [CVPR 2023]☆102Updated 7 months ago
- Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs☆90Updated 4 months ago
- [CVPR 2025] DynRefer: Delving into Region-level Multimodal Tasks via Dynamic Resolution☆50Updated 2 months ago
- Improving Mamba performance on video understanding tasks☆39Updated 6 months ago
- ☆36Updated 2 months ago
- Official project page of the paper "Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges" (Accep…☆40Updated last year
- ☆38Updated 10 months ago