Latest Advances on Reasoning of Multimodal Large Language Models (Multimodal R1 \ Visual R1) 🍓
☆36Apr 3, 2025Updated 11 months ago
Alternatives and similar repositories for Awesome-MLLM-Reasoning
Users that are interested in Awesome-MLLM-Reasoning are comparing it to the libraries listed below.
- R1-Vision: Let's first take a look at the image☆48Feb 16, 2025Updated last year
- Study notes for some courses in the BJTU computer science program☆14Dec 25, 2024Updated last year
- ☆22Nov 19, 2024Updated last year
- TransformerLight: A Novel Sequence Modeling Based Traffic Signaling Mechanism via Gated Transformer (29th ACM SIGKDD)☆31Aug 28, 2023Updated 2 years ago
- Accompanying repo for the DP2O paper accepted by AAAI 2024 main conference☆17Mar 28, 2024Updated last year
- The GitHub repository of the paper "Understanding Differential Search Index for Text Retrieval" in ACL 2023 Findings.☆16May 21, 2023Updated 2 years ago
- [ICASSP 2022] Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection☆25May 18, 2023Updated 2 years ago
- [EMNLP 2025] Distill Visual Chart Reasoning Ability from LLMs to MLLMs☆59Aug 25, 2025Updated 7 months ago
- MM-Eureka V0, also called R1-Multimodal-Journey; the latest version is in MM-Eureka☆325Jun 21, 2025Updated 9 months ago
- ☆49Updated this week
- [NAACL 2025 Main Selected Oral] Repository for the paper: Prompt Compression for Large Language Models: A Survey☆36May 18, 2025Updated 10 months ago
- ☆18Jun 10, 2023Updated 2 years ago
- ☆37Jun 28, 2021Updated 4 years ago
- Latest Advances on System-2 Reasoning☆1,339Jun 8, 2025Updated 9 months ago
- a file-based long-term memory agent skill☆23Dec 28, 2025Updated 2 months ago
- Official code for "Expression is enough: Improving traffic signal control with advanced traffic state representation".☆50Feb 28, 2024Updated 2 years ago
- [CVPR 2026] MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources☆217Sep 26, 2025Updated 6 months ago
- The project for speech translation☆12Sep 28, 2023Updated 2 years ago
- The code repo for the paper "Multi-intersection Traffic Optimisation: A Benchmark Dataset and a Strong Baseline"☆11Mar 15, 2022Updated 4 years ago
- Recent Advances in Visual Dialog☆30Aug 19, 2022Updated 3 years ago
- AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence☆10Mar 2, 2025Updated last year
- ☆11Oct 20, 2022Updated 3 years ago
- A fork to add multimodal model training to open-r1☆1,507Feb 8, 2025Updated last year
- Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning☆48Nov 8, 2023Updated 2 years ago
- Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.☆846May 14, 2025Updated 10 months ago
- [NeurIPS 2024] Efficiency for Free: Ideal Data Are Transportable Representations☆19Jan 19, 2025Updated last year
- Wind Turbine Blade Image Dataset☆13May 23, 2019Updated 6 years ago
- Combined InstantID🔥 and FouriScale to generate high resolution image!☆11Apr 3, 2024Updated last year
- A collection of BJTU final exam papers and study materials☆59Dec 30, 2023Updated 2 years ago
- This repository provides valuable reference for researchers in the field of multimodality, please start your exploratory travel in RL-bas…☆1,380Feb 26, 2026Updated 3 weeks ago
- ☆37Jan 31, 2024Updated 2 years ago
- ☆12Jul 22, 2024Updated last year
- Codes for DATA: Differentiable ArchiTecture Approximation.☆11Jul 22, 2021Updated 4 years ago
- [Preprint] GMem: A Modular Approach for Ultra-Efficient Generative Models☆43Mar 11, 2025Updated last year
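- Sudoku generation and solving algorithms☆11Jun 9, 2018Updated 7 years ago
- Uses Kinect 2.0 to read depth and other image information and segment the hand region, extracts hand-image features with HOG, then trains an SVM, with the goal of recognizing gestures.☆10Jan 8, 2020Updated 6 years ago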
- ☆12Nov 28, 2022Updated 3 years ago
- Official repository for Gait Recognition with Drones: A Benchmark (TMM 2023)☆10Feb 2, 2024Updated 2 years ago
- ☆67Sep 18, 2024Updated last year