thunlp/KARL

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/thunlp/KARL)

thunlp / KARL

KARL: Knowledge-Aware Reasoning and Reinforcement Learning for Knowledge-Intensive Visual Grounding

☆68

Alternatives and similar repositories for KARL

Users that are interested in KARL are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

RUCAIBox / Virgo
View on GitHub
Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*
☆110May 27, 2025Updated last year
OpenMatch / Gist-COCO
View on GitHub
This is the code repo for our paper "Say More with Less: Understanding Prompt Learning Behaviors through Gist Compression".
☆13Feb 27, 2024Updated 2 years ago
ZhentingWang / DUMP
View on GitHub
☆33May 9, 2025Updated last year
zhishuifeiqian / VCR-Bench
View on GitHub
VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning
☆37May 9, 2026Updated 2 months ago
ls-kelvin / REVPT
View on GitHub
Code for paper: Reinforced Vision Perception with Tools
☆74Oct 3, 2025Updated 9 months ago
GPUs on demand by Runpod - Special Offer Available • Ad
Run AI, ML, and HPC workloads on powerful cloud GPUs—without limits or wasted spend. Deploy GPUs in under a minute and pay by the second.
OpenMatch / TASTE
View on GitHub
[CIKM 2023 Oral] This is the code repo for our CIKM‘23 paper "Text Matching Improves Sequential Recommendation by Reducing Popularity Bia…
☆40Mar 17, 2024Updated 2 years ago
cetmann / robustness-interpretability
View on GitHub
Code for the Paper 'On the Connection Between Adversarial Robustness and Saliency Map Interpretability' by C. Etmann, S. Lunz, P. Maass, …
☆16May 9, 2019Updated 7 years ago
jmhb0 / microvqa
View on GitHub
[CVPR 2025] MicroVQA eval and 🤖RefineBot code for "MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research"…
☆36Nov 25, 2025Updated 7 months ago
SHI-Labs / VisPer-LM
View on GitHub
[NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation
☆73Oct 17, 2025Updated 9 months ago
hulianyuyy / iLLaVA
View on GitHub
iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models (ICLR2026)
☆23Jun 24, 2026Updated 3 weeks ago
ZhangXJ199 / TinyLLaVA-Video-R1
View on GitHub
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning
☆116Dec 24, 2025Updated 6 months ago
jusiro / CLIP-Conformal
View on GitHub
[CVPR'25] Conformal prediction for vision-language models. Enhancing VLMs deployment with reliability gurarantees.
☆21Jun 7, 2025Updated last year
yale-nlp / refdpo
View on GitHub
☆16Jul 23, 2024Updated last year
rxtan2 / Koala-video-llm
View on GitHub
☆37Sep 16, 2024Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
dongyh20 / Insight-V
View on GitHub
[CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
☆240Nov 7, 2025Updated 8 months ago
ModalMinds / MM-EUREKA
View on GitHub
MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning
☆770Sep 7, 2025Updated 10 months ago
SUSTechBruce / Med-UniC
View on GitHub
official implementation of "Med-Unic: unifying cross-lingual medical vision-language pre-training by diminishing bias"
☆18Sep 22, 2023Updated 2 years ago
kodenii / ORES
View on GitHub
ORES: Open-vocabulary Responsible Visual Synthesis
☆14Dec 12, 2023Updated 2 years ago
adobe-research / llava-score
View on GitHub
☆11Oct 2, 2024Updated last year
UCSB-AI / MMWorld
View on GitHub
Official repo of the ICLR 2025 paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"
☆28Jul 15, 2025Updated last year
daeunni / Video-Skill-CoT
View on GitHub
Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Findings]"
☆18Aug 27, 2025Updated 10 months ago
OpenMatch / MARVEL
View on GitHub
[ACL 2024 Oral] This is the code repo for our ACL‘24 paper "MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Mo…
☆39Jun 30, 2024Updated 2 years ago
mbzuai-oryx / Video-CoM
View on GitHub
Video-CoM: Interactive Video Reasoning via Chain of Manipulations
☆22Jun 17, 2026Updated last month
Wordpress hosting with auto-scaling - Free Trial Offer • Ad
Fully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
IVY-LVLM / CODE
View on GitHub
Official Implementation of CODE
☆17Sep 26, 2024Updated last year
ZhangAIPI / YOPO_MLLM_Pruning
View on GitHub
Pruning the VLLMs
☆106Dec 9, 2024Updated last year
TIGER-AI-Lab / VL-Rethinker
View on GitHub
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆189Jun 5, 2025Updated last year
real-absolute-AI / NoisyRollout
View on GitHub
[NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
☆112Sep 18, 2025Updated 10 months ago
thunlp / Migician
View on GitHub
[ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models
☆90May 20, 2025Updated last year
shuo-git / InfECE
View on GitHub
☆20Dec 31, 2020Updated 5 years ago
TideDra / lmm-r1
View on GitHub
Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.
☆847May 14, 2025Updated last year
OpenMatch / UniVL-DR
View on GitHub
[ICLR 2023] This is the code repo for our ICLR‘23 paper "Universal Vision-Language Dense Retrieval: Learning A Unified Representation Spa…
☆52Jul 3, 2024Updated 2 years ago
SUSTechBruce / LOOK-M
View on GitHub
[EMNLP 2024 Findings🔥] Official implementation of ": LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…
☆103Nov 9, 2024Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
GAIR-NLP / MAYE
View on GitHub
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme
☆149Apr 9, 2025Updated last year
Haochen-Wang409 / TreeVGR
View on GitHub
[ICLR'26] Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology
☆92Jan 26, 2026Updated 5 months ago
HJYao00 / DenseConnector
View on GitHub
【NeurIPS 2024】Dense Connector for MLLMs
☆183Oct 14, 2024Updated last year
jin-s13 / MMPD-Dataset
View on GitHub
MMPD Dataset from ECCV'2024 "When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset"
☆21Jul 15, 2024Updated 2 years ago
Sliver-g / Cardiac-CLIP
View on GitHub
☆27Jan 22, 2026Updated 5 months ago
jylins / hourllava
View on GitHub
[NeurIPS 2025 Spotlight] Unleashing Hour-Scale Video Training for Long Video-Language Understanding
☆19Jun 24, 2025Updated last year
HumanEval-V / HumanEval-V-Benchmark
View on GitHub
A Lightweight Visual Reasoning Benchmark for Evaluating Large Multimodal Models through Complex Diagrams in Coding Tasks
☆15Feb 25, 2025Updated last year