EdenGabriel / TaskWeave
[CVPR 2024 Accepted] TaskWeave: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection
☆20Updated 3 months ago
Alternatives and similar repositories for TaskWeave:
Users that are interested in TaskWeave are comparing it to the libraries listed below
- Official pytorch repository for "TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection" (AAAI 2024 Pape …☆37Updated 3 weeks ago
- ☆25Updated 4 months ago
- Official pytorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023)☆49Updated last year
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding☆44Updated 6 months ago
- The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)☆29Updated 9 months ago
- Official Pytorch Implementation of 'BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos'☆27Updated last year
- [NeurIPS 2023] The official implementation of SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation☆29Updated 10 months ago
- Composed Video Retrieval☆49Updated 8 months ago
- UniMD: Towards Unifying Moment retrieval and temporal action Detection☆41Updated 6 months ago
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval"☆16Updated 4 months ago
- Official pytorch repository for CG-DETR "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Gr…☆122Updated 4 months ago
- [ACM MM 2024] Hierarchical Multimodal Fine-grained Modulation for Visual Grounding.☆34Updated 3 months ago
- ☆37Updated 9 months ago
- [ECCV 2024] EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval☆33Updated 4 months ago
- Open-vocabulary Video Instance Segmentation Codebase built upon Detectron2, which is really easy to use.☆20Updated 2 weeks ago
- The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024