Episoode / Double-Bench
Official Code Repository for Double-Bench
☆21Updated 2 weeks ago
Alternatives and similar repositories for Double-Bench
Users who are interested in Double-Bench are comparing it to the libraries listed below
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning"☆36Updated 8 months ago
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models☆44Updated this week
- Code for ACL 2025 Main paper "Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning…☆40Updated last month
- The code and data of We-Math 2.0.☆156Updated 3 weeks ago
- MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision☆25Updated 4 months ago
- [Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics]: VisuoThink: Empowering LVLM Reasoning with Mul…☆29Updated 2 months ago
- Official repository of the video reasoning benchmark MMR-V. Can Your MLLMs "Think with Video"?☆36Updated 3 months ago
- ☆30Updated 2 months ago
- SFT+RL boosts multimodal reasoning☆32Updated 3 months ago
- EMPO, A Fully Unsupervised RLVR Method☆66Updated this week
- This repository contains the code for our ICML 2025 paper "LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection" 🎉☆24Updated 3 months ago
- Doodling our way to AGI ✏️ 🖼️ 🧠☆103Updated 3 months ago
- ☆39Updated 4 months ago
- GitHub repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025)☆74Updated this week
- ☆49Updated 2 months ago
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement☆110Updated 2 months ago
- SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis☆66Updated 2 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains☆48Updated 4 months ago
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM"☆16Updated last month
- Code for paper "Table-R1: Inference-Time Scaling for Table Reasoning"☆25Updated 3 months ago
- ☆29Updated last month
- SIFT: Grounding LLM Reasoning in Contexts via Stickers☆58Updated 6 months ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*☆108Updated 4 months ago
- DELT: Data Efficacy for Language Model Training☆34Updated 3 weeks ago
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward☆79Updated last month
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"☆59Updated 11 months ago
- ☆35Updated last week
- VeriGUI: Verifiable Long-Chain GUI Dataset☆81Updated last month
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆71Updated 3 months ago
- Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models"☆35Updated 3 months ago