Dotori-HJ / TE-TAD
[CVPR 2024] Official implementation of the paper "TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression"
☆17 · Updated 7 months ago
Alternatives and similar repositories for TE-TAD:
Users who are interested in TE-TAD are comparing it to the repositories listed below.
- Codebase for the paper: "TIM: A Time Interval Machine for Audio-Visual Action Recognition" ☆39 · Updated 3 months ago
- Official codebase for "Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling". ☆28 · Updated 6 months ago
- A curated list of awesome self-supervised learning methods in videos ☆126 · Updated last month
- [CVPR2024] The official implementation of AdaTAD: End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames ☆33 · Updated 7 months ago
- Official Repo for CVPR 2024 Paper "FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Fully-Supervised Action Segmentatio… ☆55 · Updated last month
- [CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval ☆49 · Updated 8 months ago
- [ECCV’24] Official Implementation for CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenario… ☆48 · Updated 5 months ago
- [CVPR2023] Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning (https://arxiv… ☆116 · Updated last year
- An unofficial implementation of TubeViT in "Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning" ☆89 · Updated 5 months ago
- [ECCV 2024] Official PyTorch implementation of TC-CLIP "Leveraging Temporal Contextualization for Video Action Recognition" ☆49 · Updated 4 months ago
- Official Implementation of SnAG (CVPR 2024) ☆42 · Updated 3 months ago
- Official repository for "Boosting Audio Visual Question Answering via Key Semantic-Aware Cues" in ACM MM 2024. ☆14 · Updated 3 months ago
- Official pytorch repository for CG-DETR "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Gr… ☆126 · Updated 6 months ago
- Official repository for "Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition" [ICCV 2023] ☆95 · Updated 9 months ago
- [TPAMI2024] Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset ☆277 · Updated last month
- MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models (CVPR 2023) ☆33 · Updated 10 months ago
- Code for Diffusion Action Segmentation (ICCV 2023) ☆59 · Updated last year
- [CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners". ☆265 · Updated 10 months ago
- [ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models ☆316 · Updated 8 months ago
- Future Transformer for Long-term Action Anticipation (CVPR 2022) ☆49 · Updated 2 years ago
- Large Language Models are Temporal and Causal Reasoners for Video Question Answering (EMNLP 2023) ☆74 · Updated 6 months ago
- Official pytorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023) ☆49 · Updated last year
- Official Implementation of "The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval" ☆77 · Updated 2 months ago
- [CVPR 2023] ViPLO - Official Pytorch Implementation ☆40 · Updated last year
- End to End Streaming Video Temporal Segmentation ☆24 · Updated 8 months ago
- Official Pytorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024) ☆57 · Updated 8 months ago