Official Repo for MageBench: Bridging Large Multimodal Models to Agents
☆22 · Jan 8, 2025 · Updated last year
Alternatives and similar repositories for MageBench
Users that are interested in MageBench are comparing it to the libraries listed below.
- TrackGPT: Track What You Need in Videos via Text Prompts ☆25 · May 16, 2023 · Updated 3 years ago
- ☆12 · Jul 4, 2024 · Updated last year
- HarnessX is a harness foundry: forge any number of agent harnesses from reusable processors and bundles, pair each with any model, and ev… ☆109 · Jun 17, 2026 · Updated last week
- This repo is reproduction resources for linear alignment paper, still working ☆17 · May 19, 2024 · Updated 2 years ago
- 🎮Manipulates mobile phones just like how you would. Official code for "MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficien… ☆28 · Oct 10, 2025 · Updated 8 months ago
- ☆16 · Jan 19, 2026 · Updated 5 months ago
- Source code for the Paper "Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models" ☆19 · Feb 1, 2026 · Updated 4 months ago
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback ☆73 · Aug 31, 2024 · Updated last year
- Agentic Keyframe Search for Video Question Answering ☆18 · Apr 7, 2025 · Updated last year
- An in-context learning research testbed ☆19 · Mar 16, 2025 · Updated last year
- ☆10 · Apr 7, 2025 · Updated last year
- An automated data pipeline scaling RL to pretraining levels ☆77 · Jun 2, 2026 · Updated 3 weeks ago
- [IJCV 2024] ☆21 · Nov 11, 2024 · Updated last year
- ☆16 · Apr 8, 2026 · Updated 2 months ago
- [CVPR'25] Official code of paper "Mimic In-Context Learning for Multimodal Tasks"