jyrao / SoccerAgent
[ACM Multimedia 2025] "Multi-Agent System for Comprehensive Soccer Understanding"
☆57 Updated last month
Alternatives and similar repositories for SoccerAgent
Users who are interested in SoccerAgent are comparing it to the repositories listed below
- [ACL2025 Oral & Award] Evaluate Image/Video Generation like Humans - Fast, Explainable, Flexible ☆113 Updated 4 months ago
- [EMNLP 2024 Oral] MatchTime: Towards Automatic Soccer Game Commentary Generation ☆87 Updated 11 months ago
- [ICLR 2025] SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models ☆15 Updated 2 months ago
- [CVPR 2025] "Towards Universal Soccer Video Understanding". ☆201 Updated 3 months ago
- Codes for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models ☆269 Updated 4 months ago
- This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024) ☆276 Updated last year
- 💡 VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning ☆286 Updated 2 months ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆123 Updated 4 months ago
- [CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models ☆229 Updated last month
- Pixel-Level Reasoning Model trained with RL [NeurIPS 2025] ☆254 Updated last month
- [ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining" ☆182 Updated 8 months ago
- Long Context Transfer from Language to Vision ☆398 Updated 8 months ago
- Official code for Paper "Mantis: Multi-Image Instruction Tuning" [TMLR 2024] ☆235 Updated 8 months ago
- An official implementation of "CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning" ☆149 Updated last month
- [ICML 2025] VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models ☆37 Updated 6 months ago
- [ACL 2025 🔥] Rethinking Step-by-step Visual Reasoning in LLMs ☆310 Updated 6 months ago
- [ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners" ☆151 Updated last year
- Reinforcement Learning of Vision Language Models with Self Visual Perception Reward ☆149 Updated 2 months ago
- [EMNLP 2025] Distill Visual Chart Reasoning Ability from LLMs to MLLMs ☆57 Updated 3 months ago
- ☆68 Updated 3 months ago
- A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions. ☆162 Updated 10 months ago
- (ICCV2025) Official repository of paper "ViSpeak: Visual Instruction Feedback in Streaming Videos" ☆41 Updated 5 months ago
- Official implementation of paper AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding ☆88 Updated 7 months ago
- VCode: SVG as Symbolic Visual Representation ☆114 Updated 3 weeks ago
- [ICCV 2025] Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges ☆79 Updated 9 months ago
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models. ☆73 Updated last year
- Code for the paper "AutoPresent: Designing Structured Visuals From Scratch" (CVPR 2025) ☆144 Updated 6 months ago
- Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding ☆291 Updated 4 months ago
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture ☆211 Updated 11 months ago
- ✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehensi… ☆363 Updated last month