quangminhdinh / TrafficVLM
[CVPRW 2024] TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning. Official code for the 3rd place solution of the AI City Challenge 2024 Track 2.
☆39 Updated 4 months ago
Alternatives and similar repositories for TrafficVLM
Users who are interested in TrafficVLM are comparing it to the repositories listed below.
- ☆41 Updated last week
- [ICCV2023] Tem-adapter: Adapting Image-Text Pretraining for Video Question Answer ☆37 Updated last year
- ☆90 Updated 2 months ago
- [ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM ☆76 Updated 8 months ago
- Improving Mamba performance on Video Understanding tasks ☆40 Updated 8 months ago
- [CVPR 2024] Official PyTorch implementation of the paper "One For All: Video Conversation is Feasible Without Video Instruction Tuning" ☆34 Updated last year
- ☆97 Updated 10 months ago
- [CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding ☆46 Updated 3 weeks ago
- [CVPR2024 Highlight] The official repo for paper "Abductive Ego-View Accident Video Understanding for Safe Driving Perception" ☆54 Updated 3 months ago
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ☆20 Updated 2 months ago
- [CVPR 2025] Online Video Understanding: OVBench and VideoChat-Online ☆43 Updated 2 months ago
- Official implementation of CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection". ☆41 Updated 9 months ago
- [CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection