arturxe2 / ASTRA
PyTorch Implementation of "ASTRA: An Action Spotting TRAnsformer for Soccer Videos", ACM MMSports 2023. | 3rd place solution for SoccerNet Action Spotting Challenge 2023.
☆40 Updated last year
Alternatives and similar repositories for ASTRA
Users who are interested in ASTRA are comparing it to the repositories listed below
- ☆68 Updated last year
- Make Your Training Flexible: Towards Deployment-Efficient Video Models ☆30 Updated last month
- Video-LlaVA fine-tune for CinePile evaluation ☆51 Updated 11 months ago
- Official PyTorch implementation of "No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding" ☆33 Updated last year
- [Pattern Recognition 2024] Semantic-Aware Frame-Event Fusion based Pattern Recognition via Large Vision-Language Models, Dong Li, Jiandon… ☆17 Updated 5 months ago
- Official implementation of Add-SD: Rational Generation without Manual Reference. ☆27 Updated 10 months ago
- PyTorch code for "ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning" ☆20 Updated 8 months ago
- VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models ☆35 Updated 3 months ago
- CVPR 2025 Workshop on CVEU. ☆41 Updated last month
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model ☆42 Updated 11 months ago
- Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model ☆67 Updated 6 months ago
- This repo contains the code for our TMLR paper: A Simple Video Segmenter by Tracking Objects Along Axial Trajectories ☆27 Updated 3 months ago
- Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model ☆102 Updated last week
- [ECCV'24 Workshops Oral] DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling ☆31 Updated 8 months ago
- OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024 ☆60 Updated 4 months ago
- Unofficial implementation and experiments related to Set-of-Mark (SoM) 👁️ ☆86 Updated last year
- [CVPR 2025] Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction ☆121 Updated 3 months ago
- 3D Traffic Light & Sign Dataset ☆19 Updated 3 months ago
- Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, model… ☆36 Updated last year
- This repository is the project page for "Point Anywhere: Directed Object Estimation from Omnidirectional Images", including source code … ☆11 Updated last year
- ☆78 Updated 9 months ago
- Multi-vision Sensor Perception and Reasoning (MS-PR) benchmark, assessing VLMs on their capacity for sensor-specific reasoning. ☆16 Updated 4 months ago
- ClickAttention: Click Region Similarity Guided Interactive Segmentation ☆23 Updated 6 months ago
- This repository holds the "Fully automated landmarking and facial segmentation on 3D photographs" files ☆29 Updated last year
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ☆20 Updated 3 months ago
- Simple program to manually caption your images (or any other file types) so you can use them for AI training ☆37 Updated 2 years ago
- EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams [CVPR'24] ☆27 Updated 11 months ago
- Dataset and Code for CVSports at CVPR 2024 paper "AutoSoccerPose: Automated 3D posture Analysis of Soccer Shot Movements" ☆44 Updated last year
- EdgeSAM model for use with Autodistill. ☆27 Updated last year
- TensorFlow code for our ECCV'24 Workshop paper "LightAvatar: Efficient Head Avatar as Dynamic NeLF" ☆29 Updated 8 months ago