arturxe2 / ASTRA
PyTorch Implementation of "ASTRA: An Action Spotting TRAnsformer for Soccer Videos", ACM MMSports 2023. | 3rd place solution for SoccerNet Action Spotting Challenge 2023.
☆39 · Updated 11 months ago
Alternatives and similar repositories for ASTRA:
Users who are interested in ASTRA are comparing it to the repositories listed below:
- VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models ☆31 · Updated 2 weeks ago
- Make Your Training Flexible: Towards Deployment-Efficient Video Models ☆24 · Updated last month
- Official PyTorch implementation of "No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding" ☆32 · Updated 11 months ago
- Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model ☆57 · Updated 3 months ago
- PyTorch code for "ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning" ☆20 · Updated 5 months ago
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ☆21 · Updated 3 weeks ago
- This repo contains the code for our TMLR paper: A Simple Video Segmenter by Tracking Objects Along Axial Trajectories ☆27 · Updated last month
- EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams [CVPR'24] ☆25 · Updated 8 months ago
- OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024 ☆58 · Updated 2 months ago
- Official implementation of Add-SD: Rational Generation without Manual Reference. ☆27 · Updated 8 months ago
- Tracking through Containers and Occluders in the Wild (CVPR 2023) - Official Implementation ☆41 · Updated 10 months ago
- [ICCV2023] MixSort: The Customized Tracker in SportsMOT ☆78 · Updated last year
- Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model ☆85 · Updated 3 weeks ago
- VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models ☆17 · Updated last week
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment ☆50 · Updated 3 months ago
- [ECCV 2024] Official PyTorch implementation of TC-CLIP "Leveraging Temporal Contextualization for Video Action Recognition" ☆58 · Updated 2 months ago
- 🤖 [ICLR'25] Multimodal Video Understanding Framework (MVU) ☆36 · Updated 2 months ago
- [Pattern Recognition 2024] Semantic-Aware Frame-Event Fusion based Pattern Recognition via Large Vision-Language Models, Dong Li, Jiandon… ☆17 · Updated 3 months ago
- Multi-vision Sensor Perception and Reasoning (MS-PR) benchmark, assessing VLMs on their capacity for sensor-specific reasoning. ☆15 · Updated 2 months ago
- EdgeSAM model for use with Autodistill. ☆26 · Updated 10 months ago
- Dataset and Code for CVSports at CVPR 2024 paper "AutoSoccerPose: Automated 3D posture Analysis of Soccer Shot Movements" ☆40 · Updated 10 months ago
- TensorFlow code for our ECCV'24 Workshop paper "LightAvatar: Efficient Head Avatar as Dynamic NeLF" ☆28 · Updated 5 months ago
- [CVPR25] Official repository for the paper: "SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation" ☆164 · Updated 2 weeks ago
- [CVPR 2025] Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction ☆99 · Updated last month
- A Modular End-to-End Tracking Framework for Research and Development 🎯🔬 ☆135 · Updated last week
- CAVIS: Context-Aware Video Instance Segmentation ☆86 · Updated last week
- [arXiv'25] Official Implementation of "Pix2Cap-COCO: Advancing Visual Comprehension via Pixel-Level Captioning" ☆16 · Updated 3 months ago