MSIIP / Connector-S
☆13Updated 4 months ago
Alternatives and similar repositories for Connector-S
Users who are interested in Connector-S are comparing it to the libraries listed below.
- ☆39Updated 5 months ago
- Facial Action Unit Detection Model and Visualization Canvas☆26Updated 2 weeks ago
- MedM-VL is a modular, LLaVA-based codebase for medical LVLMs.☆37Updated last week
- Video Chain of Thought, Codes for ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"☆159Updated 6 months ago
- ☆21Updated 2 months ago
- Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey☆788Updated last week
- Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models☆36Updated this week
- 🔥CVPR 2025 Multimodal Large Language Models Paper List☆153Updated 5 months ago
- [CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".☆285Updated last year
- [NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought …☆368Updated 8 months ago
- Official repository for VisionZip (CVPR 2025)☆341Updated last month
- Awesome papers & datasets specifically focused on long-term videos.☆305Updated 3 weeks ago
- ✨First Open-Source R1-like Video-LLM [2025/02/18]☆361Updated 6 months ago
- [CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding☆391Updated 3 months ago
- Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…☆901Updated this week
- [ICML'25 Spotlight] Catch Your Emotion: Sharpening Emotion Perception in Multimodal Large Language Models☆24Updated last week
- ☆58Updated 5 months ago
- [CVPR'25] Interleaved-Modal Chain-of-Thought☆80Updated 2 weeks ago
- This repository provides valuable reference for researchers in the field of multimodality, please start your exploratory travel in RL-bas…☆1,127Updated this week
- [EMNLP 2024 Oral] MatchTime: Towards Automatic Soccer Game Commentary Generation☆82Updated 8 months ago
- This is the first paper to explore how to effectively use R1-like RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages …☆684Updated last month
- Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"☆503Updated last month
- Code for Sam-Guided Enhanced Fine-Grained Encoding with Mixed Semantic Learning for Medical Image Captioning☆15Updated last year
- R1-like Video-LLM for Temporal Grounding☆115Updated 2 months ago
- ☆104Updated last month
- Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering [ACM MM'24]☆12Updated last year
- ⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning.☆197Updated last month
- This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning".☆50Updated last month
- ☆13Updated 4 months ago
- MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning☆730Updated last month