FreedomIntelligence / MedGen
MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos.
☆28 Updated 6 months ago
Alternatives and similar repositories for MedGen
Users who are interested in MedGen are comparing it to the repositories listed below.
- [ACL 2025] Exploring Compositional Generalization of Multimodal LLMs for Medical Imaging☆38 Updated 7 months ago
- Encourage Medical LLM to engage in deep thinking similar to DeepSeek-R1.☆26 Updated 8 months ago
- ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning☆107 Updated 2 months ago
- [ACM MM 2025] The official code of "Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs"☆97 Updated last month
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration☆26 Updated last year
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains☆50 Updated 8 months ago
- ☆41 Updated 5 months ago
- [EMNLP 2025] Med-PRM: Medical Reasoning Models with Stepwise, Guideline-verified Process Rewards☆55 Updated 3 months ago
- CLIP-MoE: Mixture of Experts for CLIP☆51 Updated last year
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". …☆62 Updated last year
- MedEvalKit: A Unified Medical Evaluation Framework☆198 Updated 2 months ago
- MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants☆41 Updated 3 months ago
- [TMLR 25] SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models☆146 Updated 3 months ago
- DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding☆65 Updated 7 months ago
- GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI.☆79 Updated last year
- High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning☆51 Updated 5 months ago
- [NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context☆168 Updated last year
- The official repo for LIFT: Language-Image Alignment with Fixed Text Encoders☆41 Updated 7 months ago
- [ML4H'25] m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models☆48 Updated 3 weeks ago
- Citrus-V: Advancing Medical Foundation Models with Unified Medical Image Grounding for Clinical Reasoning☆18 Updated 3 months ago
- ☆39 Updated 11 months ago
- Code for paper: Reinforced Vision Perception with Tools☆68 Updated 3 months ago
- Official implementation of MIA-DPO☆70 Updated 11 months ago
- Official implementation of "Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology"☆72 Updated 2 months ago
- ☆56 Updated last month
- [ECCV 2024] FlexAttention for Efficient High-Resolution Vision-Language Models☆46 Updated last year
- [ICLR2025] Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want☆92 Updated last month
- [ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning☆50 Updated last year
- Official Repository of "AtomThink: Multimodal Slow Thinking with Atomic Step Reasoning"☆57 Updated last month
- [NeurIPS 2025] Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO☆73 Updated 2 months ago