Latest Advances on (RL based) Multimodal Reasoning and Generation in Multimodal Large Language Models
☆47Oct 30, 2025Updated 4 months ago
Alternatives and similar repositories for Awesome-Multimodal-Reasoning
Users who are interested in Awesome-Multimodal-Reasoning are comparing it to the repositories listed below
- Official repository of "TDSD: Text-Driven Scene-Decoupled Weakly Supervised Video Anomaly Detection"☆11May 25, 2025Updated 9 months ago
- ☆10Nov 27, 2024Updated last year
- [ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models☆24Jan 1, 2026Updated 2 months ago
- ✨First Open-Source R1-like Video-LLM [2025/02/18]☆382Feb 23, 2025Updated last year
- R1-like Video-LLM for Temporal Grounding☆134Jun 20, 2025Updated 8 months ago
- [Arxiv 2025] In-Video Instructions: Visual Signals as Generative Control☆46Nov 25, 2025Updated 3 months ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs☆54Mar 9, 2025Updated last year
- spatio-temporal tasks☆16Jul 15, 2024Updated last year
- ☆31Jan 30, 2026Updated last month
- ☆36Jun 30, 2025Updated 8 months ago
- MR. Video: MapReduce is the Principle for Long Video Understanding☆31Apr 23, 2025Updated 10 months ago
- ☆111Sep 11, 2025Updated 5 months ago
- [CVPR 2025 Oral] VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection☆138Jul 28, 2025Updated 7 months ago
- Latest Advances on Modality Priors in Multimodal Large Language Models☆30Dec 10, 2025Updated 3 months ago
- Recent Advances on MLLM's Reasoning Ability☆26Apr 11, 2025Updated 10 months ago
- Awesome papers & datasets specifically focused on long-term videos.☆355Oct 9, 2025Updated 5 months ago
- [NIPS2025] VideoChat-R1 & R1.5: Enhancing Spatio-Temporal Perception and Reasoning via Reinforcement Fine-Tuning☆260Oct 18, 2025Updated 4 months ago
- [CVPR2025] Number it: Temporal Grounding Videos like Flipping Manga☆145Jan 19, 2026Updated last month
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment☆29Sep 27, 2024Updated last year
- [ACL 2025] The official PyTorch implementation of "MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection".☆25May 26, 2025Updated 9 months ago
- A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.☆73Mar 18, 2025Updated 11 months ago
- This repository provides valuable reference for researchers in the field of multimodality, please start your exploratory travel in RL-bas…☆1,365Feb 26, 2026Updated last week
- ☆41Sep 9, 2025Updated 6 months ago
- A CNN-BiLSTM model for Li-ion battery state of health and remaining useful life prediction☆11Mar 25, 2024Updated last year
- (ACL 2025 Main) Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillat…☆34Aug 23, 2025Updated 6 months ago
- ☆21May 19, 2025Updated 9 months ago
- [ICCV2023] Unsupervised Surface Anomaly Detection with Diffusion Probabilistic Model☆37May 6, 2024Updated last year
- PyTorch code for "Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training"☆39Mar 4, 2024Updated 2 years ago
- LinVT: Empower Your Image-level Large Language Model to Understand Videos☆84Dec 30, 2024Updated last year
- 🔥🔥🔥 Latest Papers, Codes and Datasets on Video-LMM Post-Training☆263Mar 3, 2026Updated last week
- Project that gathers state-of-the-art knowledge distillation approaches for unsupervised anomaly detection☆13Oct 10, 2025Updated 5 months ago
- [NeurIPS 2025] Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO☆79Oct 29, 2025Updated 4 months ago
- [AAAI 2025] Neural-Symbolic Collaborative Distillation: Advancing Small Language Models for Complex Reasoning Tasks☆11Jun 19, 2025Updated 8 months ago
- Build VGG16 with PyTorch 0.4.0 for classification of CIFAR datasets☆10Mar 31, 2019Updated 6 years ago
- Code for the paper: "T-shape data and probabilistic remaining useful life prediction for Li-ion batteries using multiple non-crossing qua…☆10Aug 4, 2023Updated 2 years ago
- Documentation at☆14Mar 27, 2025Updated 11 months ago
- ☆13Jan 2, 2025Updated last year
- ICCV 2025 Code for "Salvaging the Overlooked: Leveraging Class-Aware Contrastive Learning for Multi-Class Anomaly Detection"☆11Nov 26, 2025Updated 3 months ago
- Unsupervised learning and detection method for fabric defects based on a convolutional autoencoder and image pyramid☆10Jun 28, 2023Updated 2 years ago