ZJU-REAL/GSM8K-V

GSM8K-V: Can Vision Language Models Solve Grade School Math Word Problems in Visual Contexts

☆40

Alternatives and similar repositories for GSM8K-V

Users who are interested in GSM8K-V are comparing it to the repositories listed below.

ZJU-REAL / LAPO
☆37 · Oct 9, 2025 · Updated 9 months ago
ZJU-REAL / Self-Braking-Tuning
[NeurIPS 2025] Let LRMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604
☆54 · Nov 4, 2025 · Updated 8 months ago
ZJU-REAL / OmniEmbodied
Benchmarking agent reasoning capabilities in physical interactions, tool usage, and multi-agent coordination.
☆45 · Aug 10, 2025 · Updated 10 months ago
ZJU-REAL / TimeHC-RL
This repository is the official implementation of TimeHC-RL (Distilabel (Data Generation) + TRL (SFT) + VeRL (GRPO)).
☆48 · Jun 4, 2025 · Updated last year
ZJU-REAL / cooper
☆28 · Aug 19, 2025 · Updated 10 months ago
ZJU-REAL / ViewSpatial-Bench
[ECCV 2026] ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models
☆78 · Mar 9, 2026 · Updated 4 months ago
ZJU-REAL / GUI-RCPO
[AAAI 2026] Test-Time Reinforcement Learning for GUI Grounding via Region Consistency. https://arxiv.org/abs/2508.05615
☆67 · Nov 8, 2025 · Updated 8 months ago
ZJU-REAL / SVGenius
[ACM MM 2025] SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation. https://arxiv.org/abs/2506.03139
☆78 · Nov 10, 2025 · Updated 7 months ago
ZJU-REAL / EasySteer
A Unified Framework for High-Performance and Extensible LLM Steering
☆280 · Apr 30, 2026 · Updated 2 months ago
ZJU-REAL / SkillZero
Official code for "SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization"
☆344 · May 20, 2026 · Updated last month
ZJU-REAL / KnowU-Bench
Official code for "KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation"
☆72 · Jun 13, 2026 · Updated 3 weeks ago
yuanhongyi / zjucalc24
Repository of Calculus (A) I Course Materials for the Autumn-Winter Semester of the 2024-2025 Academic Year at Zhejiang University.
☆10 · Jun 2, 2026 · Updated last month
inclusionAI / dFactory
Easy and Efficient dLLM Fine-Tuning
☆262 · Mar 2, 2026 · Updated 4 months ago
ctrl-gaurav / LLMThinkBench
An Advanced Basic Math Reasoning and Overthinking Evaluation Framework for LLMs
☆12 · Apr 20, 2026 · Updated 2 months ago
ZJU-REAL / GUI-G2
[AAAI 2026] GUI-G²: Gaussian Reward Modeling for GUI Grounding
☆310 · Apr 15, 2026 · Updated 2 months ago
hwanyu112 / VIBE-Benchmark
☆27 · Feb 3, 2026 · Updated 5 months ago
DongPoLI / NMS_SoftNMS
NMS and Soft-NMS implemented in PyTorch and NumPy
☆12 · Mar 22, 2021 · Updated 5 years ago
ZJU-REAL / Awesome-GUI-Agents
A curated collection of resources, tools, and frameworks for developing GUI Agents.
☆437 · Jun 2, 2026 · Updated last month
sugarandgugu / GaVaMoE
Code for GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation
☆18 · Dec 7, 2024 · Updated last year
YutingLi0606 / Vision-Matters
(ArXiv25) Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning
☆60 · Sep 30, 2025 · Updated 9 months ago
wonderful9462 / IC-Former
Code for "In-Context Former: Lightning-fast Compressing Context for Large Language Model" (Findings of EMNLP 2024)
☆21 · Nov 21, 2024 · Updated last year
Caoang327 / vis_det
Code of "Visualizing and Understanding Object Detector"
☆20 · Jun 24, 2021 · Updated 5 years ago
zwq2018 / Auto_star
Auto star for repo lists
☆10 · Aug 26, 2023 · Updated 2 years ago
personqianduixue / comap_crawler
MCM crawler: scrapes certificates from the Mathematical Contest in Modeling (MCM) and analyzes their information with OCR
☆16 · Jun 25, 2022 · Updated 4 years ago
LAION-AI / laion50BU
Un-*** 50 billions multimodality dataset
☆24 · Sep 14, 2022 · Updated 3 years ago
XiPotatonium / chatbot-webui
A chatbot webui that supports various multi-modal large language models
☆11 · May 8, 2023 · Updated 3 years ago
HumanMLLM / LOVE-R1
Official repository of paper "LOVE-R1: Advancing Long Video Understanding with Adaptive Zoom-in Mechanism via Multi-Step Reasoning"
☆24 · Nov 1, 2025 · Updated 8 months ago
tigerbunny2023 / PROSE
Code for progressive GSL
☆12 · Jan 15, 2026 · Updated 5 months ago
LaVi-Lab / Video-3D-LLM
[CVPR 2025] The code for paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding".
☆216 · Jun 4, 2025 · Updated last year
Arking1995 / COHO
[ECCV 2024 Oral] The official implementation of paper: COHO: Context-Sensitive City-Scale Hierarchical Urban Layout Generation
☆13 · Aug 13, 2024 · Updated last year
YennNing / Graph4MM
[ICML 2025] Graph4MM: Weaving Multimodal Learning with Structural Information
☆28 · Jul 9, 2025 · Updated last year
nishiwen1214 / Benchmark-leakage-detection
Official completion of "Training on the Benchmark Is Not All You Need".
☆40 · Dec 31, 2024 · Updated last year
RingBDStack / SE-GSL
The code for "SE-GSL: A General and Effective Graph Structure Learning Framework through Structural Entropy Optimization"
☆15 · Mar 10, 2023 · Updated 3 years ago
gqk / RelayGS
RelayGS: Reconstructing Dynamic Scenes with Large-Scale and Complex Motions via Relay Gaussians
☆14 · Dec 5, 2024 · Updated last year
Zjer12 / Sub
☆15 · Mar 2, 2023 · Updated 3 years ago
guanjinquan / CXRTrek
Interpreting Chest X-rays Like a Radiologist: A Benchmark with Clinical Reasoning, release the dataset and the model weight
☆13 · May 26, 2025 · Updated last year
HDETR / H-PETR-3D
[CVPR2023] This is an official implementation of paper "DETRs with Hybrid Matching".
☆28 · Sep 1, 2022 · Updated 3 years ago
niopeng / PAPR-in-Motion
Official implementation of "PAPR in Motion: Seamless Point-level 3D Scene Interpolation"
☆14 · Nov 6, 2024 · Updated last year
YZsZY / OctreeGS-Ubuntu-Viewer
Rewrite the cmakefile to install and run it on Ubuntu
☆15 · Sep 11, 2024 · Updated last year