zelaix/VS-Bench

VS-Bench: Evaluating VLMs for Strategic Reasoning and Decision-Making in Multi-Agent Environments

☆25

Alternatives and similar repositories for VS-Bench

Users interested in VS-Bench often compare it to the repositories listed below.

thu-uav / VolleyBots
Code release for "A Testbed for Multi-Drone Volleyball Game Combining Motion Control and Strategic Play" (NeurIPS 2025), https://arxiv.or…
☆63 · Mar 2, 2026 · Updated 4 months ago
thu-nics / MARSHAL
[ICLR'26] MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs
☆54 · Apr 17, 2026 · Updated 3 months ago
thu-uav / HCSP
Code release for "Mastering Multi-Drone Volleyball through Hierarchical Co-Self-Play Reinforcement Learning" (CoRL 2025), https://arxiv.o…
☆24 · Feb 26, 2026 · Updated 4 months ago
ethanhe42 / dds
DDS: Delta Denoising Score PyTorch implementation
☆19 · Sep 2, 2023 · Updated 2 years ago
HaohanZou / CoNSAL
Official implementation of CoNSAL for analytical Lyapunov function discovery
☆12 · Jun 26, 2024 · Updated 2 years ago
thu-uav / FlightBench
☆47 · Apr 8, 2025 · Updated last year
yuexy / ST-AR
☆14 · Sep 22, 2025 · Updated 9 months ago
wenhuang2000 / VHTest
VHTest
☆16 · Oct 31, 2024 · Updated last year
Aurora-slz / MM-Verify
☆19 · Oct 28, 2025 · Updated 8 months ago
lmgame-org / GRL
Multi-Turn RL Training System with AgentTrainer for Language Model Game Reinforcement Learning
☆65 · Dec 18, 2025 · Updated 7 months ago
xlyu0106 / MACT
☆19 · Jul 31, 2025 · Updated 11 months ago
PeixianChen / citation-count
Counts non-self citations of papers, based on Google Scholar.
☆14 · Dec 8, 2022 · Updated 3 years ago
thu-nics / R2R
[NeurIPS'25] The official code implementation for paper "R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Tok…
☆94 · Apr 7, 2026 · Updated 3 months ago
ripl / statler
The official repository for the paper "Statler: State-Maintaining Language Models for Embodied Reasoning"
☆13 · Jun 10, 2024 · Updated 2 years ago
thu-uav / SimpleFlight
What Matters in Learning A Zero-Shot Sim-to-Real RL Policy for Quadrotor Control? A Comprehensive Study
☆103 · Jun 11, 2025 · Updated last year
zeyofu / Commonsense-T2I
Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024]
☆24 · Aug 13, 2024 · Updated last year
facebookresearch / multimodal_rewardbench
Multimodal RewardBench
☆68 · Feb 21, 2025 · Updated last year
diaoquesang / GL-LCM
[MICCAI 2025] GL-LCM: Global-Local Latent Consistency Models for Fast High-Resolution Bone Suppression in Chest X-Ray Images
☆17 · Mar 12, 2026 · Updated 4 months ago
Sueqk / LMM-VQA
LMM for VQA, tcsvt version
☆10 · Jul 19, 2024 · Updated 2 years ago
imagination-research / LCSC
[ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
☆16 · Feb 15, 2025 · Updated last year
mwufi / meta-rl-bandits
A simple RNN meta-learner
☆10 · Dec 17, 2018 · Updated 7 years ago
OpenGVLab / MM-NIAH
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…
☆126 · Nov 25, 2024 · Updated last year
Infini-AI-Lab / APE
☆38 · Feb 12, 2025 · Updated last year
mlbio-epfl / LaMer
[ICLR 2026] Meta-RL Induces Exploration in Language Agents
☆45 · Feb 1, 2026 · Updated 5 months ago
ylsung / vl-merging
PyTorch codes for the paper "An Empirical Study of Multimodal Model Merging"
☆37 · Oct 11, 2023 · Updated 2 years ago
wjxts / RegularizedBN
☆21 · Dec 30, 2022 · Updated 3 years ago
yale-nlp / refdpo
☆16 · Jul 23, 2024 · Updated last year
hkust-nlp / RL-Verifier-Robustness
From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.
☆24 · Oct 7, 2025 · Updated 9 months ago
jiaangli / VILA
[TACL/EMNLP'24] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study
☆16 · Nov 22, 2024 · Updated last year
yunxiangfu2001 / LaMamba-Diff
LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba (Official Implementation)
☆17 · Oct 24, 2024 · Updated last year
HITSZ-MAS / STORM
☆18 · Aug 17, 2025 · Updated 11 months ago
microsoft / DKI_LLM
This is a repository for the DKI group's LLM-related papers, along with code.
☆40 · May 20, 2026 · Updated 2 months ago
David-Li0406 / Preference-Leakage
☆55 · May 22, 2025 · Updated last year
walkerning / nics_fix_pytorch
pytorch fixed point training tool/framework
☆34 · Oct 14, 2020 · Updated 5 years ago
UCLA-VAST / heterohalide
HeteroHalide: From Image Processing DSL to Efficient FPGA Acceleration
☆15 · Sep 14, 2020 · Updated 5 years ago
EternityYW / Gemini-Commonsense-Evaluation
Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models"
☆38 · Jan 3, 2024 · Updated 2 years ago
kyegomez / PaLM2-VAdapter
Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron…
☆17 · Nov 11, 2024 · Updated last year
rhfeiyang / Opt-In-Art
Official implementation of "Opt-In Art: Learning Art Styles Only from Few Examples" (Accepted by NeurIPS 2025)
☆33 · Nov 30, 2025 · Updated 7 months ago
hmatsu1226 / SCOUP
SCOUP is a probabilistic model to analyze single-cell expression data during differentiation
☆10 · Apr 20, 2017 · Updated 9 years ago