Latest Advances on Reasoning of Multimodal Large Language Models (Multimodal R1 \ Visual R1) ) 🍓
☆36Apr 3, 2025Updated last year
Alternatives and similar repositories for Awesome-MLLM-Reasoning
Users that are interested in Awesome-MLLM-Reasoning are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- BJTU计科专业部分课程学习记录☆15Dec 25, 2024Updated last year
- 北京交通大学 Beamer 主题(非官方)|Beamer Theme for Beijing Jiaotong University (unofficial)☆18Sep 16, 2022Updated 3 years ago
- TransformerLight: A Novel Sequence Modeling Based Traffic Signaling Mechanism via Gated Transformer (29th ACM SIGKDD)☆30Aug 28, 2023Updated 2 years ago
- [ICASSP 2020] CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition (A PyTorch implementation of Continuous Integrate-and-…☆79Jan 9, 2025Updated last year
- 如需体验textin文档解析,请点击https://cc.co/16YSIy☆22Jul 9, 2024Updated last year
- Deploy open-source AI quickly and easily - Special Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- [EMNLP 2025] Distill Visual Chart Reasoning Ability from LLMs to MLLMs☆61Aug 25, 2025Updated 9 months ago
- ☆51Mar 20, 2026Updated 2 months ago
- 按照学校的顺序,发布每个学校的单独的夏令营。(包含交叉学科的)☆29Sep 6, 2024Updated last year
- ☆18Jun 10, 2023Updated 2 years ago
- Cost-Sensitive Toolpath Agent for Multi-turn Image Editing☆31Mar 26, 2025Updated last year
- ☆37Jun 28, 2021Updated 4 years ago
- Latest Advances on System-2 Reasoning☆1,352Jun 8, 2025Updated 11 months ago
- Official code for "Expression is enough: Improving traffic signal control with advanced traffic state representation ".☆51Feb 28, 2024Updated 2 years ago
- [CVPR 2026] MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources☆218Sep 26, 2025Updated 8 months ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Recent Advances in Visual Dialog☆28Aug 19, 2022Updated 3 years ago
- KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques☆51Dec 9, 2024Updated last year
- ☆11Oct 20, 2022Updated 3 years ago
- AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence☆10Mar 2, 2025Updated last year
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning☆48Nov 8, 2023Updated 2 years ago
- A fork to add multimodal model training to open-r1☆1,548Feb 8, 2025Updated last year
- Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.☆843May 14, 2025Updated last year
- [NeurIPS 2024] Efficiency for Free: Ideal Data Are Transportable Representations☆19Jan 19, 2025Updated last year
- [CVPR 2026] Variation-aware Vision Token Dropping for Faster Large Vision-Language Models☆30Mar 18, 2026Updated 2 months ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- Awesome Entity Alignment is a collection of EA techniques, including papers, codes, and datasets.☆11Oct 27, 2022Updated 3 years ago
- [INTERSPEECH 2023] Knowledge Transfer from Pre-trained Language Models to Cif-based Recognizers via Hierarchical Distillation☆41Sep 1, 2023Updated 2 years ago
- ☆12Jul 7, 2022Updated 3 years ago
- 电子病历结构化解析☆13May 11, 2022Updated 4 years ago
- ☆16Sep 6, 2023Updated 2 years ago
- CCF-CSP历年真题答案☆80Nov 26, 2025Updated 6 months ago
- ☆13Sep 25, 2024Updated last year
- Implementation of Unified Embedding: Battle-Tested Feature Representations for Web-Scale ML Systems☆15Nov 11, 2023Updated 2 years ago
- [CVPR2026] BinaryAttention: One-Bit QK-Attention for Vision and Diffusion Transformers☆33Mar 17, 2026Updated 2 months ago
- Deploy open-source AI quickly and easily - Special Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- This repository provides valuable reference for researchers in the field of multimodality, please start your exploratory travel in RL-bas…☆1,412May 11, 2026Updated 2 weeks ago
- Visual Dialog: Light-weight Transformer for Many Inputs (ECCV 2020)☆29Aug 5, 2021Updated 4 years ago
- ☆13Jul 22, 2024Updated last year
- [Preprint] GMem: A Modular Approach for Ultra-Efficient Generative Models☆43Mar 11, 2025Updated last year
- 数独的生成算法和解题算法☆11Jun 9, 2018Updated 7 years ago
- NAR-BERT-ASR☆10Sep 27, 2021Updated 4 years ago
- Offical respority for Gait Recogniton with Drones: A benchmark (TMM 2023)☆10Feb 2, 2024Updated 2 years ago