declare-lab / LLM-PuzzleTest

This repository is maintained to release datasets and models for multimodal puzzle reasoning.

☆105

Alternatives and similar repositories for LLM-PuzzleTest

Users who are interested in LLM-PuzzleTest are comparing it to the repositories listed below.

HZQ950419 / Math-LLaVA
Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models
☆91 Updated last year
mathllm / MATH-V
[NeurIPS 2024] MATH-Vision dataset and code to measure multimodal mathematical reasoning capabilities.
☆116 Updated 4 months ago
princeton-nlp / CharXiv
[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
☆126 Updated 5 months ago
mlfoundations / VisIT-Bench
☆50 Updated last year
vlf-silkie / VLFeedback
☆100 Updated last year
sail-sg / scaling-with-vocab
[NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623
☆86 Updated last year
MAmmoTH-VL / MAmmoTH-VL
(ACL 2025) MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale
☆48 Updated 4 months ago
ryoungj / BoLT
Code for "Reasoning to Learn from Latent Thoughts"
☆120 Updated 6 months ago
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆155 Updated 4 months ago
zwq2018 / Multi-modal-Self-instruct
The codebase for our EMNLP24 paper: Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Mo…
☆83 Updated 8 months ago
TIGER-AI-Lab / MEGA-Bench
This repo contains the code for "MEGA-Bench Scaling Multimodal Evaluation to over 500 Real-World Tasks" [ICLR 2025]
☆77 Updated 3 months ago
DAMO-NLP-SG / multimodal_textbook
[ICCV 2025 Highlight] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"
☆172 Updated 6 months ago
hkust-nlp / llm-compression-intelligence
Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]
☆142 Updated last year
facebookresearch / multimodal_rewardbench
Multimodal RewardBench
☆53 Updated 7 months ago
reka-ai / reka-vibe-eval
Multimodal language model benchmark, featuring challenging examples
☆176 Updated 9 months ago
hkust-nlp / mstar
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆69 Updated 2 months ago
ruixin31 / Spurious_Rewards
☆333 Updated 2 months ago
yihedeng9 / STIC
Enhancing Large Vision Language Models with Self-Training on Image Comprehension.
☆70 Updated last year
YiyangZhou / POVID
[Arxiv] Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
☆88 Updated last year
RUCAIBox / Virgo
Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*
☆109 Updated 4 months ago
si0wang / ThinkLite-VL
☆101 Updated 4 months ago
luka-group / mDPO
[EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models.
☆81 Updated 11 months ago
ReasoningTransfer / Transferability-of-LLM-Reasoning
☆98 Updated last month
FreedomIntelligence / MLLM-Bench
MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria
☆71 Updated 11 months ago
yunfeixie233 / ViGaL
☆59 Updated 4 months ago
da03 / implicit_chain_of_thought
☆137 Updated 11 months ago
GuanghaoYe / Emergence-of-Thinking
☆53 Updated 7 months ago
SihengLi99 / SEALONG
Large Language Models Can Self-Improve in Long-context Reasoning
☆73 Updated 10 months ago
TideDra / VL-RLHF
A RLHF Infrastructure for Vision-Language Models
☆184 Updated 10 months ago
princeton-nlp / PTP
Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073
☆30 Updated last year