woven-visionai / wts-dataset
☆41Updated last week
Alternatives and similar repositories for wts-dataset
Users who are interested in wts-dataset are comparing it to the repositories listed below.
- [CVPRW 2024] TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning. Official code for the 3rd place solution of t…☆39Updated 4 months ago
- ☆48Updated 11 months ago
- ☆47Updated last year
- ☆14Updated last year
- [CVPR2024 Highlight] Official repository of the paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detect…☆56Updated 2 months ago
- [ICCV2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer☆37Updated last year
- AICITY2024 Track 2 - Code from AIO_ISC Team☆34Updated 11 months ago
- [ECCV 2024] Beyond MOT: Semantic Multi-Object Tracking☆48Updated 7 months ago
- ☆9Updated 9 months ago
- Video Feature Enhancement with PyTorch☆31Updated 7 months ago
- ☆38Updated last year
- [CVPR2024 Highlight] The official repo for paper "Abductive Ego-View Accident Video Understanding for Safe Driving Perception"☆54Updated 3 months ago
- [ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM☆76Updated 8 months ago
- Official project page of the paper "Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges" (Accep…☆48Updated last year
- Codes From Top Teams in 2023 AIC challenge☆79Updated 2 years ago
- Official Code of CVPR'23 Paper "VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision"☆22Updated last year
- Improving Mamba performance on video understanding tasks☆40Updated 8 months ago
- ☆30Updated last year
- Official implementation of the CVPR paper Open-TransMind: A New Baseline and Benchmark for 1st Foundation Model Challenge of Intelligent …☆26Updated 2 years ago
- Official implementation of the paper "OvarNet: Towards Open-vocabulary Object Attribute Recognition"☆104Updated 2 years ago
- [ECCV 2024] Beyond MOT: Semantic Multi-Object Tracking☆30Updated 9 months ago
- Foundation Models for Video Understanding: A Survey☆123Updated 9 months ago
- codes from top teams of AI City Challenge 2022☆94Updated 3 years ago
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding☆46Updated 3 weeks ago
- Official implementation of "Delving into CLIP latent space for Video Anomaly Recognition", CVIU 2024☆67Updated 7 months ago
- [CVPR-2023 Workshop@NFVLR] Official PyTorch implementation of Learning CLIP Guided Visual-Text Fusion Transformer for Video-based Pedestr…☆28Updated 3 months ago
- Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs☆91Updated 5 months ago
- FreeVA: Offline MLLM as Training-Free Video Assistant☆60Updated last year
- Codes for ICML 2023 Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation☆37Updated last year
- TF-CLIP: Learning Text-Free CLIP for Video-Based Person Re-identification (AAAI2024)☆50Updated last year