A collection of awesome works on reasoning models like O1/R1 in the visual domain
☆53Jul 21, 2025Updated 7 months ago
Alternatives and similar repositories for awesome-deep-multimodal-reasoning
Users who are interested in awesome-deep-multimodal-reasoning are comparing it to the repositories listed below
- Checkpoints, logs and source code for AAAI-23 paper 'Data-Efficient Image Quality Assessment with Attention-Panel Decoder'☆39Apr 3, 2024Updated last year
- [ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models☆20Jul 17, 2024Updated last year
- ☆62Jan 20, 2026Updated last month
- The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.☆38Sep 8, 2024Updated last year
- ☆46Sep 27, 2025Updated 5 months ago
- Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models☆111Jul 7, 2025Updated 8 months ago
- We propose IAD-R1, a universal post-training framework that enhances Vision-Language Models for industrial anomaly detection through a tw…☆69Dec 9, 2025Updated 3 months ago
- 🚀 Vibe Stack - Docker setup for AI-powered coding with Vibe-Kanban + Claude Code | Secure secrets, browser-based VS Code, ready to deplo…☆42Feb 10, 2026Updated last month
- Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual in…☆1,346Feb 3, 2026Updated last month
- Dify knowledge base retrieval tool☆13Apr 3, 2025Updated 11 months ago
- ☆14Aug 28, 2024Updated last year
- [ICLR 2024] This is the official implementation for the paper: "Beyond imitation: Leveraging fine-grained quality signals for alignment"☆10May 5, 2024Updated last year
- A simple exam generator and grader written in Python with OpenCV☆14Jan 14, 2026Updated last month
- Building a multi-agent RAG system with advanced RAG methods☆12Jan 12, 2025Updated last year
- ☆15Nov 11, 2024Updated last year
- Repository containing dataset, models and code associated with the CHIME project☆17Aug 22, 2024Updated last year
- The corresponding code from our paper "Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning…☆13Jan 14, 2026Updated last month
- Surrogate Modeling of the Aerodynamic Performance for Transonic Regime☆13Feb 12, 2024Updated 2 years ago
- Official repository for Robust Multimodal Large Language Models Against Modality Conflict☆16Jul 9, 2025Updated 8 months ago
- DocChecker: Bootstrapping Code-Text Pretrained Language Model to Detect Inconsistency Between Code and Comment☆15Jan 23, 2024Updated 2 years ago
- ☆14Sep 17, 2024Updated last year
- ☆11Aug 15, 2025Updated 6 months ago
- This code is for ChaLearn LAP Large-scale Continuous Gesture Recognition Challenge (Round 2) @ICCV 2017☆10Oct 21, 2017Updated 8 years ago
- ☆11Feb 2, 2026Updated last month
- Fine-tuning Llama2-7b and other LLMs for categorising emails for Deutsche Bahn (German National Railways)☆13Oct 9, 2023Updated 2 years ago
- 🕵️♂️🔊 Automatically update Audio Deepfake Detection (ADD) papers daily using GitHub Actions (updates every 12 hours)☆17Feb 13, 2026Updated 3 weeks ago
- Advances in recent large vision language models (LVLMs)☆15Sep 23, 2024Updated last year
- ☆11Aug 4, 2020Updated 5 years ago
- Dynamically parse and fill different formats of wav headers.☆11Jan 11, 2024Updated 2 years ago
- Official implementation of "EndoUIC: Promptable Diffusion Transformer for Unified Illumination Correction in Capsule Endoscopy", MICCAI 2…☆11Jan 29, 2026Updated last month
- ☆28Jan 5, 2026Updated 2 months ago
- Speech Security and Privacy Compendium - Mini☆10Jun 18, 2024Updated last year
- awesome-audio-visual-robustness☆11Jan 27, 2024Updated 2 years ago
- calvis: Chest, wAist and peLVIS circumference from 3D human Body meshes for Deep Learning.☆11May 15, 2025Updated 9 months ago
- This is the PyTorch implementation of GACNet on S3DIS.☆14Jul 24, 2022Updated 3 years ago
- Object tracking based on SiamFC & DaSiamRPN using GOT-10k toolkit. Demo & Visualization.☆10Jun 29, 2020Updated 5 years ago
- Code for "Speaker Clustering using Dominant Sets", ICPR 2018☆11Nov 28, 2020Updated 5 years ago
- ☆16Jun 10, 2025Updated 9 months ago
- Generate a 3D BIM Model from 2D CAD Drawings☆12Nov 23, 2022Updated 3 years ago