FreedomIntelligence / MedGen
MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos.
☆18 Updated last week
Alternatives and similar repositories for MedGen
Users who are interested in MedGen are comparing it to the repositories listed below.
- Encourage Medical LLM to engage in deep thinking similar to DeepSeek-R1. ☆25 Updated 2 months ago
- [ACM MM25] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs" ☆80 Updated last week
- [ACL 2025] Exploring Compositional Generalization of Multimodal LLMs for Medical Imaging ☆36 Updated last month
- ☆54 Updated 4 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆46 Updated 2 months ago
- SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models ☆124 Updated 2 months ago
- m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models ☆35 Updated 3 months ago
- ☆24 Updated this week
- ☆83 Updated 6 months ago
- The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo… ☆79 Updated 5 months ago
- An Arena-style Automated Evaluation Benchmark for Detailed Captioning ☆50 Updated last month
- Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models? ☆24 Updated 4 months ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆105 Updated last month
- Doodling our way to AGI ✏️ 🖼️ 🧠 ☆72 Updated last month
- [MM2024, oral] "Self-Supervised Visual Preference Alignment" https://arxiv.org/abs/2404.10501 ☆56 Updated 11 months ago
- Preference Learning for LLaVA ☆46 Updated 8 months ago
- ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration ☆45 Updated 6 months ago
- MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models ☆41 Updated 3 months ago
- [ICML 2025] Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in… ☆144 Updated this week
- [ICCV 2025] Dynamic-VLM ☆21 Updated 6 months ago
- This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training … ☆49 Updated 2 months ago
- A Self-Training Framework for Vision-Language Reasoning ☆80 Updated 5 months ago
- Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning ☆50 Updated last month
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆20 Updated last month
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". … ☆56 Updated 8 months ago
- Official repo for "PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning" ☆42 Updated this week
- This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vi… ☆110 Updated 3 weeks ago
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration ☆24 Updated 8 months ago
- MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants ☆36 Updated 2 months ago
- Code, Data and Model for Paper "Learning from Peers in Reasoning Models" ☆24 Updated 2 months ago