KARL: Knowledge-Aware Reasoning and Reinforcement Learning for Knowledge-Intensive Visual Grounding
☆67Apr 5, 2026Updated 3 weeks ago
Alternatives and similar repositories for KARL
Users that are interested in KARL are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*☆110May 27, 2025Updated 11 months ago
- This is the code repo for our paper "Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression".☆13Feb 27, 2024Updated 2 years ago
- ☆27Jan 22, 2026Updated 3 months ago
- ☆33May 9, 2025Updated 11 months ago
- VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning☆36Jul 15, 2025Updated 9 months ago
- GPUs on demand by Runpod - Special Offer Available • AdRun AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
- [CIKM 2023 Oral] This is the code repo for our CIKM‘23 paper "Text Matching Improves Sequential Recommendation by Reducing Popularity Bia…☆40Mar 17, 2024Updated 2 years ago
- [ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation☆142Sep 11, 2025Updated 7 months ago
- [CVPR 2025] MicroVQA eval and 🤖RefineBot code for "MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research"…☆37Nov 25, 2025Updated 5 months ago
- Code release for 'Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs' (NeurIPS 2025)☆30Oct 28, 2025Updated 6 months ago
- Code for the Paper 'On the Connection Between Adversarial Robustness and Saliency Map Interpretability' by C. Etmann, S. Lunz, P. Maass, …☆16May 9, 2019Updated 6 years ago
- iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models (ICLR2026)☆22Mar 29, 2026Updated last month
- ☆16Jul 23, 2024Updated last year
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation☆72Oct 17, 2025Updated 6 months ago
- TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning☆115Dec 24, 2025Updated 4 months ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- [CVPR'25] Conformal prediction for vision-language models. Enhancing VLMs deployment with reliability gurarantees.☆21Jun 7, 2025Updated 10 months ago
- Code for paper: Reinforced Vision Perception with Tools☆72Oct 3, 2025Updated 6 months ago
- ☆37Sep 16, 2024Updated last year
- official implementation of "Med-Unic: unifying cross-lingual medical vision-language pre-training by diminishing bias"☆17Sep 22, 2023Updated 2 years ago
- Official repo of the ICLR 2025 paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"☆28Jul 15, 2025Updated 9 months ago
- [CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models☆240Nov 7, 2025Updated 5 months ago
- MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning☆773Sep 7, 2025Updated 7 months ago
- Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge☆118Apr 1, 2026Updated 3 weeks ago
- ORES: Open-vocabulary Responsible Visual Synthesis☆14Dec 12, 2023Updated 2 years ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- ☆11Oct 2, 2024Updated last year
- [ACL 2024 Oral] This is the code repo for our ACL‘24 paper "MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Mo…☆39Jun 30, 2024Updated last year
- Official Implementation of CODE☆16Sep 26, 2024Updated last year
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation☆112Sep 18, 2025Updated 7 months ago
- MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment☆35Jul 1, 2024Updated last year
- Pruning the VLLMs☆105Dec 9, 2024Updated last year
- [ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models☆89May 20, 2025Updated 11 months ago
- [ICLR 2023] This is the code repo for our ICLR‘23 paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Spa…☆52Jul 3, 2024Updated last year
- Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.☆844May 14, 2025Updated 11 months ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- 【NeurIPS 2024】Dense Connector for MLLMs☆183Oct 14, 2024Updated last year
- [EMNLP 2024 Findings🔥] Official implementation of ": LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…☆104Nov 9, 2024Updated last year
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆147Apr 9, 2025Updated last year
- ☆19Jun 10, 2025Updated 10 months ago
- ☆20Dec 31, 2020Updated 5 years ago
- MMPD Dataset from ECCV'2024 "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset"☆21Jul 15, 2024Updated last year
- Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"☆315Sep 28, 2025Updated 7 months ago