Official Repo for MageBench: Bridging Large Multimodal Models to Agents
☆22Jan 8, 2025Updated last year
Alternatives and similar repositories for MageBench
Users that are interested in MageBench are comparing it to the libraries listed below.
- ☆12Jul 4, 2024Updated last year
- Training DIAMOND to play MarioKart64 in a Neural Network.☆30Sep 9, 2025Updated 8 months ago
- Agentic Keyframe Search for Video Question Answering☆18Apr 7, 2025Updated last year
- An in-context learning research testbed☆19Mar 16, 2025Updated last year
- ☆10Apr 7, 2025Updated last year
- [IJCV 2024]☆21Nov 11, 2024Updated last year
- ☆16Apr 8, 2026Updated last month
- ☆25Aug 9, 2025Updated 9 months ago
- [CVPR'25] Official code of paper "Mimic In-Context Learning for Multimodal Tasks"☆26Mar 10, 2026Updated 2 months ago
- Rationale-enhanced language models are better continual relation learners (EMNLP 2023 Main Conference)☆12Oct 11, 2023Updated 2 years ago
- ☆13Jun 13, 2025Updated 11 months ago
- Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron…☆17Nov 11, 2024Updated last year
- Official PyTorch implementation of: "Cannot See the Forest for the Trees: Aggregating Multiple Viewpoints to Better Classify Objects in V…☆14Aug 29, 2022Updated 3 years ago
- THOUGHTSCULPT, a general reasoning and search method for complex tasks☆13Dec 13, 2024Updated last year
- [ECCV 2024] Official Implementation of CoPT: Unsupervised Domain Adaptive Segmentation using Domain-Agnostic Text Embeddings☆11Feb 24, 2025Updated last year
- Official implementation of the paper "LTrack: Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language Rep…☆12Jul 26, 2023Updated 2 years ago
- Synthesize bio-plausible neural networks for cognitive tasks, mimicking brain architecture☆11Apr 14, 2021Updated 5 years ago
- The public reproducible analysis code used for the gaze project☆10Feb 21, 2026Updated 2 months ago
- CVMHT : Complementary-View Multiple Human Tracking (AAAI 2020).☆10Dec 9, 2021Updated 4 years ago
- The official implementation of the TIP 2025 paper UncTrack: Reliable Visual Object Tracking with Uncertainty-Aware Prototype Memory Netwo…☆15Jun 16, 2025Updated 11 months ago
- Code associated with the paper: "Few-Shot Self-Rationalization with Natural Language Prompts"☆13Apr 27, 2022Updated 4 years ago
- Codebase for Mechanistic Mode Connectivity☆13Jul 14, 2023Updated 2 years ago
- ECCV 2024 DTC Dataset Tooling☆22Jan 12, 2026Updated 4 months ago
- The official repository for the experiments included in the paper titled "Patch-level Routing in Mixture-of-Experts is Provably Sample-ef…☆14Feb 12, 2026Updated 3 months ago
- Repo for Anonymous purpose, pls don't distribute☆10Oct 2, 2024Updated last year
- [IROS'25] COCMT☆12Aug 14, 2025Updated 9 months ago
- ☆15Jul 9, 2025Updated 10 months ago
- Tight Mutual Information Estimation With Contrastive Fenchel-Legendre Optimization☆11Nov 29, 2022Updated 3 years ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning☆97May 23, 2024Updated last year
- ☆16May 13, 2022Updated 4 years ago
- This fully reconfigurable action, validates conformity with Azure Developer CLI template standards.☆22Apr 8, 2026Updated last month
- Automatically replace full publication names in a bibtex database file into official abbreviated names, or reverse. (Support IEEE/ACM/Sci…☆14Jul 30, 2024Updated last year
- An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.☆162Apr 6, 2026Updated last month
- Dataset Pinocchio for paper "Towards Understanding Factual Knowledge of Large Language Models" accepted by ICLR 2024 (Spotlight)☆12Mar 13, 2024Updated 2 years ago
- VS-Bench: Evaluating VLMs for Strategic Reasoning and Decision-Making in Multi-Agent Environments☆23Sep 30, 2025Updated 7 months ago
- Our research proposes a novel MoGU framework that improves LLMs' safety while preserving their usability.☆18Jan 14, 2025Updated last year
- Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts☆16Feb 26, 2024Updated 2 years ago
- Multi agent gym environment based on the classic Snake game with implementations of various reinforcement learning algorithms in pytorch☆15Jun 21, 2022Updated 3 years ago
- Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models (ACL-Findings 2024)☆16Apr 23, 2024Updated 2 years ago