LiChenyang-Github / LongShortNet
LongShortNet for Streaming Perception task.
☆10 · Updated last year
Related projects:
- This is the code for CVPR2022 paper "Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation" ☆18 · Updated last year
- ☆16 · Updated 2 years ago
- Video Feature Enhancement with PyTorch ☆22 · Updated 7 months ago
- Unifying Visual Perception by Dispersible Points Learning (ECCV 2022) ☆51 · Updated 2 years ago
- Teach-DETR: Better Training DETR with Teachers ☆28 · Updated 6 months ago
- Code for the ICML 2023 paper "Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation" ☆35 · Updated last year
- PyTorch implementation of Location-free Camouflage Generation Network ☆18 · Updated 2 years ago
- Lightweight Transformer for Multi-modal Tasks ☆15 · Updated last year
- [AAAI-2024] Structural Information Guided Multimodal Pre-training for Vehicle-centric Perception, Xiao Wang, Wentao Wu, Chenglong Li, Zhi… ☆18 · Updated last month
- ☆17 · Updated 5 months ago
- ☆25 · Updated last year
- Code for Temporal Data Augmentations (ECCVW 2020) ☆35 · Updated 4 years ago
- ☆16 · Updated this week
- ☆16 · Updated 3 years ago
- Code For Our Work: DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries [ECCV-2024] ☆12 · Updated 2 months ago
- Frame Flexible Network (CVPR2023) ☆52 · Updated last year
- Hawk: Learning to Understand Open-World Video Anomalies ☆14 · Updated last month
- [AAAI 2024 Oral] M2CLIP: A Multimodal, Multi-Task Adapting Framework for Video Action Recognition ☆25 · Updated 2 months ago
- [CVPR 2022] Cross-Architecture Self-supervised Video Representation Learning ☆22 · Updated 2 years ago
- ☆18 · Updated this week
- [NeurIPS 2022] code for the paper, SemMAE: Semantic-guided masking for learning masked autoencoders ☆32 · Updated last year
- ☆15 · Updated 3 months ago
- [AAAI 2024] AVSegFormer: Audio-Visual Segmentation with Transformer ☆52 · Updated 5 months ago
- A simple but efficient transformer model for video action recognition ☆52 · Updated last year
- Official implementation of CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection". ☆22 · Updated last week
- CVPR2022 - Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation ☆22 · Updated 2 years ago
- A Siamese self-supervised pretraining approach for the Transformer architecture in DETR ☆33 · Updated last year
- ☆32 · Updated 2 years ago
- [ICCV2023] Spatio-temporal Prompting Network for Robust Video Feature Extraction ☆9 · Updated last year
- Code of Pyramid Vision Transformer at BMVC 2022 ☆26 · Updated last year