MSIIP / Connector-S
☆13 Updated 5 months ago
Alternatives and similar repositories for Connector-S
Users that are interested in Connector-S are comparing it to the libraries listed below
- ☆39 Updated 6 months ago
- MedM-VL is a modular, LLaVA-based codebase for medical LVLMs. ☆44 Updated last week
- 🔥CVPR 2025 Multimodal Large Language Models Paper List ☆155 Updated 7 months ago
- This repository is the official implementation of "Look-Back: Implicit Visual Re-focusing in MLLM Reasoning". ☆64 Updated 3 months ago
- [CVPR' 25] Interleaved-Modal Chain-of-Thought ☆89 Updated last week
- Video Chain of Thought, Codes for ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition" ☆168 Updated 7 months ago
- Official repository for VisionZip (CVPR 2025) ☆358 Updated 3 months ago
- Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering [ACM MM'24] ☆12 Updated last year
- 🌟 A step-by-step guide to inserting code links in your paper ☆22 Updated 2 months ago
- ☆31 Updated 4 months ago
- ☆16 Updated 6 months ago
- Official implementation of MC-LLaVA. ☆140 Updated last month
- Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models ☆39 Updated last week
- 📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models. ☆313 Updated last week
- [Neurips'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought … ☆383 Updated 9 months ago
- [CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments". ☆290 Updated last year
- Facial Action Unit Detection Model and Visualization Canvas ☆29 Updated 2 months ago
- Collections of Papers and Projects for Multimodal Reasoning. ☆105 Updated 5 months ago
- [CVPR 2025 (Oral)] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key ☆82 Updated 3 weeks ago
- [ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs' ☆274 Updated 6 months ago
- Code repository for "Post-pre-training for Modality Alignment in Vision-Language Foundation Models" (CVPR2025) ☆29 Updated 2 months ago
- Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in… ☆1,033 Updated 2 weeks ago
- [CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding ☆324 Updated last year
- [NeurIPS'23] ODE-based Recurrent Model-free Reinforcement Learning for POMDPs ☆15 Updated 5 months ago
- ☆58 Updated 7 months ago
- [ICML'25 Spotlight] Catch Your Emotion: Sharpening Emotion Perception in Multimodal Large Language Models ☆36 Updated last month
- Official PyTorch repository for GRAM ☆96 Updated 5 months ago
- Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models ☆22 Updated 3 weeks ago
- ✨First Open-Source R1-like Video-LLM [2025/02/18] ☆368 Updated 7 months ago
- A collection of multimodal reasoning papers, codes, datasets, benchmarks and resources. ☆317 Updated last month