microsoft/MageBench

Official Repo for MageBench: Bridging Large Multimodal Models to Agents

☆22

Alternatives and similar repositories for MageBench

Users interested in MageBench are comparing it to the repositories listed below.

microsoft / Phi-Ground
☆35 · May 12, 2026 · Updated 2 months ago
zhangmiaosen2000 / Towards-On-Policy-SFT
☆19 · Mar 26, 2026 · Updated 3 months ago
zhangmiaosen2000 / Phi-Ground
Home page for Microsoft Phi-Ground tech-report
☆22 · Sep 8, 2025 · Updated 10 months ago
eshoyuan / TrackGPT
TrackGPT: Track What You Need in Videos via Text Prompts
☆25 · May 16, 2023 · Updated 3 years ago
HITsz-TMG / ICL-State-Vector
☆12 · Jul 4, 2024 · Updated 2 years ago
Wizardcoast / Linear_Alignment
This repo is reproduction resources for linear alignment paper, still working
☆17 · May 19, 2024 · Updated 2 years ago
Ablustrund / APPS_Plus
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
☆73 · Aug 31, 2024 · Updated last year
Kamichanw / ICLTestbed
An in-context learning research testbed
☆19 · Mar 16, 2025 · Updated last year
jgoerzen / nncp
Debian packaging for NNCP [archived], moved to https://salsa.debian.org/go-team/packages/nncp
☆14 · Feb 18, 2023 · Updated 3 years ago
hustvl / WeakCLIP
[IJCV 2024]
☆21 · Nov 11, 2024 · Updated last year
mm-vl / ULM-R1
Co-Reinforcement Learning for Unified Multimodal Understanding and Generation
☆48 · Jul 22, 2025 · Updated last year
LittleGuoKe / AI-and-Person-ReID
☆11 · Nov 10, 2023 · Updated 2 years ago
kyegomez / PaLM2-VAdapter
Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron…
☆17 · Nov 11, 2024 · Updated last year
Yijia-Xiao / LogicVista
☆18 · Aug 1, 2024 · Updated last year
WeiminXiong / RationaleCL
Rationale-enhanced language models are better continual relation learners (EMNLP 2023 Main Conference)
☆12 · Oct 11, 2023 · Updated 2 years ago
sukjunhwang / set_classifier
Official PyTorch implementation of: "Cannot See the Forest for the Trees: Aggregating Multiple Viewpoints to Better Classify Objects in V…
☆14 · Aug 29, 2022 · Updated 3 years ago
cyzus / thoughtsculpt
THOUGHTSCULPT, a general reasoning and search method for complex tasks
☆13 · Dec 13, 2024 · Updated last year
mbzuai-oryx / TrackingMeetsLMM
☆10 · Apr 7, 2025 · Updated last year
chang-github-00 / Predictive-Decoding
Repo for Anonymous purpose, pls don't distribute
☆10 · Oct 2, 2024 · Updated last year
Ahnsun / LTrack
Official implementation of the paper "LTrack: Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language Rep…
☆12 · Jul 26, 2023 · Updated 2 years ago
stogiannidis / srbench
Source code for the Paper "Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models"
☆19 · Feb 1, 2026 · Updated 5 months ago
zoryzhang / referential-gaze
The public reproducible analysis code used for the gaze project
☆11 · May 16, 2026 · Updated 2 months ago
chang-github-00 / LLM-Predictive-Decoding
☆16 · Jul 9, 2025 · Updated last year
adityagilra / archibrain
Synthesize bio-plausible neural networks for cognitive tasks, mimicking brain architecture
☆11 · Apr 14, 2021 · Updated 5 years ago
taco-group / COCMT
[IROS'25] COCMT
☆12 · Aug 14, 2025 · Updated 11 months ago
RuizeHan / CVMHT
CVMHT : Complementary-View Multiple Human Tracking (AAAI 2020).
☆10 · Dec 9, 2021 · Updated 4 years ago
allenai / feb
Code associated with the paper: "Few-Shot Self-Rationalization with Natural Language Prompts"
☆12 · Apr 27, 2022 · Updated 4 years ago
SafeAILab / RAIN
[ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning
☆99 · May 23, 2024 · Updated 2 years ago
VITA-Group / Robust_Weight_Signatures
[ICML 2023] "Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?" by Ruisi Cai, Zhenyu Zhang, Zhangyang Wang
☆16 · May 4, 2023 · Updated 3 years ago
HKUDS / AutoMemory
Build the memory your agent actually needs — automatically.
☆18 · Jul 3, 2026 · Updated 2 weeks ago
WolfgangKonen / GBG
General Board Game Playing
☆25 · Jun 16, 2025 · Updated last year
weih527 / Pixel-Embedded-Affinity
Learning to Model Pixel-Embedded Affinity for Homogeneous Instance Segmentation
☆13 · Jul 16, 2023 · Updated 3 years ago
microsoft / MM-WebAgent
Build coherent and visually polished multimodal webpages with hierarchical planning, AIGC tools, and iterative reflection.
☆15 · May 17, 2026 · Updated 2 months ago
EkdeepSLubana / MMC
Codebase for Mechanistic Mode Connectivity
☆13 · Jul 14, 2023 · Updated 3 years ago
qingguo666 / FLO
Tight Mutual Information Estimation With Contrastive Fenchel-Legendre Optimization
☆11 · Nov 29, 2022 · Updated 3 years ago
DYR1 / MoGU
Our research proposes a novel MoGU framework that improves LLMs' safety while preserving their usability.
☆18 · Jan 14, 2025 · Updated last year
OpenRLHF / OpenRLHF-M
An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.
☆163 · Apr 6, 2026 · Updated 3 months ago
Boyiliee / LLaDA-AV
Driving Everywhere with Large Language Model Policy Adaptation
☆18 · Jul 4, 2024 · Updated 2 years ago
aerogjy / iCaps
☆16 · May 13, 2022 · Updated 4 years ago