YoucanBaby / MH-DETR
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer
☆20Updated last year
Alternatives and similar repositories for MH-DETR
Users who are interested in MH-DETR are comparing it to the repositories listed below.
- (IJCV 2024 & ACM MM 2021 Oral) Multi-Source Fusion and Automatic Predictor Selection for Zero-Shot Video Object Segmentation☆119Updated 3 years ago
- Practical New Tasks and Inspiring Modeling Solutions for Diverse Open Vision Problems☆140Updated 2 months ago
- ☆207Updated 7 months ago
- [NeurIPS 2025] NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding☆346Updated last week
- PySegMetrics (PSM): A Python-based Simple yet Efficient Evaluation Toolbox for Segmentation-like tasks☆123Updated last year
- [NeurIPS 2025] More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models☆215Updated last month
- CVPR2025☆44Updated 9 months ago
- [NeurIPS 2024] Official Implementation of Hawk: Learning to Understand Open-World Video Anomalies☆224Updated 8 months ago
- Official repository of MMGenBench☆120Updated 9 months ago
- A curated collection of AI+X papers published in Nature / Science / Cell / Lancet / Radiology and their flagship sub-journals☆136Updated 2 months ago
- PyTorch implementation for "Unlearning the Noisy Correspondence Makes CLIP More Robust (ICCV 2025)"☆68Updated 3 months ago
- High Quality Video Reasoning Segmentation☆130Updated last month
- Official PyTorch implementation for ICML 2025 paper "Large Continual Instruction Assistant"☆66Updated this week
- [CVPR 2025 Highlight] Official Implementation of SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Activation Sparsity☆116Updated this week
- A summary of code and papers for unified models targeting context-dependent (CD) concept segmentation.☆119Updated 4 months ago
- **Deep Video Discovery (DVD)** is a deep-research style question answering agent designed for understanding extra-long videos.☆320Updated last month
- Data and sample evaluation codes for Multimodal Rewardbench 2☆117Updated last week
- This is the source code for the ECCV paper "MTFormer: Multi-Task Learning via Transformer and Cross-Task Reasoning"☆200Updated 3 years ago
- 🔥 [AAAI 2026 Oral] Official code for Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptat…☆72Updated last year
- CoS: Chain-of-Shot Prompting for Long Video Understanding☆52Updated 10 months ago
- (ECCV 2024) Open-Vocabulary Camouflaged Object Segmentation☆254Updated 4 months ago
- Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better☆184Updated 6 months ago
- ☆67Updated 4 months ago
- (ICML 2024) Spider: A Unified Framework for Context-dependent Concept Segmentation☆351Updated 9 months ago
- ☆198Updated 2 months ago
- Official Code of MICCAI'24 early accepted paper "LGRNet: Local-Global Reciprocal Network for Uterine Fibroid Segmentation in Ultrasound Vi…☆170Updated last year
- Dynamic Divide-and-Conquer Adversarial Training for Robust Semantic Segmentation (ICCV2021)☆199Updated 4 years ago
- (TIP 2022) Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction☆109Updated 9 months ago
- (CVPR 2024 & arXiv 2025) Power Battery Detection☆310Updated 3 months ago
- [AAAI 2026 Oral] Official repository for InfiGUI-G1. We introduce Adaptive Exploration Policy Optimization (AEPO) to overcome semantic al…☆115Updated last month