MCG-NJU / Dynamic-MDETR
[TPAMI 2024] Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding
☆13 · Updated last week
Related projects:
- Champion Solutions repository for Perception Test challenges in ICCV2023 workshop. ☆13 · Updated 11 months ago
- [ECCV 2022] Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing ☆27 · Updated 2 years ago
- SeqTR: A Simple yet Universal Network for Visual Grounding ☆128 · Updated 3 months ago
- [T-PAMI 2023] Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection ☆34 · Updated last year
- ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning ☆35 · Updated 11 months ago
- [CVPR 2024] Official PyTorch implementation of the paper "One For All: Video Conversation is Feasible Without Video Instruction Tuning" ☆24 · Updated 7 months ago
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding ☆38 · Updated 2 months ago
- The official repository for ICLR2024 paper "FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition" ☆55 · Updated 5 months ago
- Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023) ☆54 · Updated 7 months ago
- This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentati… ☆63 · Updated 3 months ago
- A lightweight codebase for referring expression comprehension and segmentation ☆50 · Updated 2 years ago
- CVPR 2023 Accepted Paper HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models ☆52 · Updated 6 months ago
- (TIP 2024) Towards Robust Referring Image Segmentation ☆20 · Updated 6 months ago
- Official pytorch repository for "TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection" (AAAI 2024 Pape… ☆30 · Updated last month
- Official PyTorch implementation of the paper "Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring" ☆91 · Updated 7 months ago
- [CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory" ☆62 · Updated 4 months ago
- [NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation ☆26 · Updated 6 months ago
- This repo holds the official code and data for "Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with H… ☆17 · Updated 4 months ago
- [CVPR2023] Code Release of Aligning Bag of Regions for Open-Vocabulary Object Detection ☆172 · Updated 10 months ago
- [TMM 2023] Self-paced Curriculum Adapting of CLIP for Visual Grounding. ☆104 · Updated 2 months ago
- Pytorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023) ☆50 · Updated 3 months ago
- Official Implementation of SnAG (CVPR 2024) ☆32 · Updated 4 months ago
- The official implementation of AdaTAD: End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames ☆30 · Updated 2 months ago
- [ACM MM 2024] Hierarchical Multimodal Fine-grained Modulation for Visual Grounding. ☆27 · Updated last month
- [ICCV 2023] Official PyTorch implementation of the paper "DiffTAD: Temporal Action Detection with Proposal Denoising Diffusion" ☆32 · Updated last year