visgym/VisGym

Official Repository of VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents

☆114

Alternatives and similar repositories for VisGym

Users who are interested in VisGym are comparing it to the repositories listed below.

para-lost / ECHO
Echo: "Constantly Improving Image Models Need Constantly Improving Benchmarks" (ICLR 2026)
☆20 · Jan 29, 2026 · Updated 5 months ago
dynamic-lm / interrupt-lrm
🔥 [ICML 2026] Official implementation of "Are LRMs Interruptible?"
☆18 · Jun 18, 2026 · Updated last month
Playful-RATs / RATs
Implementation of the paper "Playful Agentic Robot Learning"
☆99 · Jun 20, 2026 · Updated last month
yujunwei04 / UnSAMv2
Code release for "UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity"
☆82 · Feb 1, 2026 · Updated 5 months ago
Kyunnilee / visual_puzzles
🧩 Official code repository for “Puzzled by Puzzles: When Vision-Language Models Can’t Take a Hint.”
☆15 · Sep 22, 2025 · Updated 9 months ago
see-say-segment / sesame
🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"
☆47 · Jun 16, 2024 · Updated 2 years ago
tsunghan-wu / reverse_vlm
🔥 [NeurIPS 2025] Official implementation of "Generate, but Verify: Reducing Visual Hallucination in Vision-Language Models with Retrospe…
☆58 · Jan 22, 2026 · Updated 5 months ago
facebookresearch / threadweaver
The implementation of "ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models"
☆67 · Apr 8, 2026 · Updated 3 months ago
Parallel-Reasoning / APR
[COLM 2025] Code for the paper "Learning Adaptive Parallel Reasoning with Language Models"
☆144 · Dec 17, 2025 · Updated 7 months ago
AntResearchNLP / ViLaSR
[NeurIPS 2025] Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
☆98 · Jul 27, 2025 · Updated 11 months ago
para-lost / RVP
Recursive Visual Programming (ECCV 2024)
☆18 · Nov 20, 2024 · Updated last year
visual-haystacks / mirage
🔥 [ICLR 2025] Official PyTorch Model "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark"
☆27 · Feb 9, 2025 · Updated last year
Wakals / CoVT
[ECCV 2026] Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"
☆376 · Apr 17, 2026 · Updated 3 months ago
DavidMChan / grazier
A tool for calling (and calling out to) large language models.
☆16 · Aug 13, 2024 · Updated last year
arijitray1993 / SAT
Spatial Aptitude Training for Multimodal Language Models
☆33 · Feb 8, 2026 · Updated 5 months ago
StarTrail-org / RAG-DS-Serve
[AAAI26]: DS SERVE: The Largest Open Vector Store over Pretrain Data; A Framework for Efficient and Scalable Neural Retrieval
☆53 · Jan 28, 2026 · Updated 5 months ago
Hansxsourse / VRMDiff
☆11 · Mar 11, 2025 · Updated last year
ESI-Bench / ESI-Bench
☆116 · Updated this week
NVlabs / AutoGaze
AutoGaze automatically removes redundant patches in a video, reducing the number of tokens in ViT/MLLM by 4x-100x.
☆297 · May 5, 2026 · Updated 2 months ago
solaris-wm / solaris-engine
Scalable Minecraft multiplayer data collection engine
☆139 · Apr 23, 2026 · Updated 2 months ago
para-lost / AutoPresent
Code for the paper "AutoPresent: Designing Structured Visuals From Scratch" (CVPR 2025)
☆174 · May 26, 2025 · Updated last year
mll-lab-nu / Theory-of-Space
THEORY OF SPACE: a benchmark for evaluating whether foundation models can actively explore under partial observability efficiently to bui…
☆85 · Feb 27, 2026 · Updated 4 months ago
cheolhong0916 / contrastive-probing
☆15 · Jun 19, 2026 · Updated last month
zlab-princeton / 3d-gen-mem
Code release for "Memorization in 3D Shape Generation: An Empirical Study"
☆21 · Dec 30, 2025 · Updated 6 months ago
TIGER-AI-Lab / VL-Rethinker
The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25]
☆189 · Jun 5, 2025 · Updated last year
NVlabs / Long-RL
Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
☆726 · Sep 24, 2025 · Updated 9 months ago
lisadunlap / VibeCheck
Automated Qualitative Analysis of LLMs (ICLR 2025)
☆53 · Jul 6, 2025 · Updated last year
HorizonWind2004 / reconstruction-alignment
[ICLR 2026] Official repo of paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potenti…
☆410 · May 23, 2026 · Updated last month
mll-lab-nu / VAGEN
World model reasoning RL for multi-turn VLM agents
☆488 · Jul 7, 2026 · Updated 2 weeks ago
mll-lab-nu / ViewAgent
☆20 · Jul 3, 2026 · Updated 2 weeks ago
g-luo / dual_process
Official PyTorch Implementation for Dual-Process Image Generation, ICCV 2025
☆133 · Aug 29, 2025 · Updated 10 months ago
waynchi / editbench
☆31 · Apr 7, 2026 · Updated 3 months ago
multimodal-reasoning-lab / Bagel-Zebra-CoT
https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT
☆137 · Jan 30, 2026 · Updated 5 months ago
UMass-Embodied-AGI / Mirage
[CVPR 2026] Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens
☆293 · Aug 2, 2025 · Updated 11 months ago
TonyLianLong / CrossMAE
Official Implementation of the CrossMAE paper: Rethinking Patch Dependence for Masked Autoencoders
☆135 · Apr 10, 2025 · Updated last year
FrontierCS / Frontier-CS
A benchmark for evaluating LLMs on open-ended CS problems. Exploring the Next Frontier of Computer Science.
☆272 · Updated this week
visual-haystacks / vhs_benchmark
🔥 [ICLR 2025] Official Benchmark Toolkits for "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark"
☆44 · Nov 21, 2025 · Updated 8 months ago
lisadunlap / StringSight
Automatically Analyze your Model Traces
☆45 · Mar 16, 2026 · Updated 4 months ago
bytedance / UniVR
☆17 · Updated this week