GVCLab / VAU-R1
Official Implementation of "VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning".
☆36Updated 3 weeks ago
Alternatives and similar repositories for VAU-R1
Users who are interested in VAU-R1 are comparing it to the repositories listed below.
- ☆48Updated last year
- [ICLR 2024] Scaling for Training Time and Post-hoc Out-of-distribution Detection Enhancement.☆13Updated last year
- ☆24Updated 9 months ago
- ☆45Updated 6 months ago
- ☆52Updated last year
- ☆26Updated 11 months ago
- Teach-DETR: Better Training DETR with Teachers☆31Updated last year
- The official repository for ICLR2024 paper "FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition"☆81Updated 5 months ago
- [ECCV2024] ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference☆85Updated 3 months ago
- [ECCV 2024] Official repository for "DataDream: Few-shot Guided Dataset Generation"☆40Updated 11 months ago
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"☆41Updated 6 months ago
- Official implementation of TagAlign☆35Updated 6 months ago
- UniMD: Towards Unifying Moment retrieval and temporal action Detection☆48Updated 11 months ago
- DiverGen (CVPR 2024) & BSGAL (ICML 2024)☆46Updated 3 months ago
- ☆30Updated last year
- ☆26Updated 11 months ago
- Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation☆49Updated last month
- [AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation☆29Updated last year
- ☆31Updated 9 months ago
- Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal Prompting☆40Updated last week
- [ICLR 2025] IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model☆31Updated 7 months ago
- Sambor: Boosting Segment Anything Model Towards Open-Vocabulary Learning☆30Updated last year
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning☆41Updated last year
- ☆32Updated last year
- [WACV2025 Oral] DeepMIM: Deep Supervision for Masked Image Modeling☆53Updated last month
- [ECCV 2024] The official PyTorch implementation of the "Plain-Det: A Plain Multi-Dataset Object Detector".☆28Updated 6 months ago
- [AAAI2023] Revisiting the Spatial and Temporal Modeling for Few-shot Action Recognition (SloshNet)☆13Updated last year
- ☆117Updated last year
- ☆58Updated last year
- Video Reasoning Segmentation☆20Updated 7 months ago