RAIVNLab/mnms

m&ms: A Benchmark to Evaluate Tool-Use for multi-step multi-modal tasks

☆44

Alternatives and similar repositories for mnms

Users who are interested in mnms are comparing it to the repositories listed below.

RAIVNLab / CREPE
[CVPR23 Highlight] CREPE: Can Vision-Language Foundation Models Reason Compositionally?
☆35 · Apr 27, 2023 · Updated 2 years ago
JieyuZ2 / ProVision
An instruction data generation system for multimodal language models.
☆35 · Jan 31, 2025 · Updated last year
SalesforceAIResearch / LATTE
☆68 · Sep 15, 2025 · Updated 5 months ago
JieyuZ2 / TaskMeAnything
[NeurIPS 2024] A task generation and model evaluation system for multimodal language models.
☆73 · Nov 27, 2024 · Updated last year
RAIVNLab / sugar-crepe
[NeurIPS 2023] A faithful benchmark for vision-language compositionality
☆89 · Feb 13, 2024 · Updated 2 years ago
k1rezaei / Text-to-concept
☆35 · Feb 5, 2024 · Updated 2 years ago
JHU-CLSP / turking-bench
Web-grounded natural language instructions
☆18 · Nov 25, 2024 · Updated last year
magicgh / Self-MAP
[ACL 2024] On the Multi-turn Instruction Following for Conversational Web Agents
☆17 · Oct 12, 2024 · Updated last year
JoeYing1019 / UltraTool
[ACL 2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios
☆70 · Aug 5, 2025 · Updated 6 months ago
elehcimd / mltraq
Track and Collaborate on ML & AI Experiments.
☆44 · Mar 10, 2025 · Updated 11 months ago
pleaseconnectwifi / DANCE
PyTorch code for Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles (DANCE)
☆23 · Nov 29, 2022 · Updated 3 years ago
HowieHwong / MetaTool
[ICLR 2024] MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use
☆110 · Mar 21, 2024 · Updated last year
weikaih04 / Synthetic-Detection-Segmentation-Grounding-Data
[CVPR 2026] An accurate and dense-annotated synthetic dataset for training SOTA detectors / segmentors / Grounding-VLMs.
☆100 · Updated this week
allenai / close
☆59 · Aug 30, 2023 · Updated 2 years ago
jimexist / surya-rs
Rust implementation of Surya
☆65 · Mar 1, 2025 · Updated 11 months ago
Dongping-Chen / ISG
(ICLR 2025 Spotlight) Official code repository for Interleaved Scene Graph.
☆31 · Aug 7, 2025 · Updated 6 months ago
umd-huang-lab / Mementos
☆32 · Feb 8, 2024 · Updated 2 years ago
yuhui-zh15 / C3
Official implementation of "Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data" (ICLR 2024)
☆34 · Oct 16, 2024 · Updated last year
llms-heart-mir / tutorial
☆38 · Jun 16, 2024 · Updated last year
jylee425 / b-moca
Benchmarking Mobile Device Control Agents across Diverse Configurations (ICLR 2024 workshop GenAI4DM spotlight presentation; CoLLAs 2025)
☆35 · Jul 21, 2025 · Updated 7 months ago
yonatanbitton / wysiwyr
☆37 · Oct 7, 2023 · Updated 2 years ago
ggjy / vision_weak_to_strong
☆38 · Feb 8, 2024 · Updated 2 years ago
ChenYutongTHU / Learning-to-manipulate-individual-objects-in-an-image-Implementation
[CVPR 2020] A generative model with latent factors that are independent and localized.
☆12 · Mar 27, 2025 · Updated 11 months ago
microsoft / ToolTalk
Evaluating tool-augmented LLMs in conversation settings
☆89 · May 31, 2024 · Updated last year
codezakh / LilT
[ICLR 23] Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning
☆40 · Jul 29, 2023 · Updated 2 years ago
sieve-community / describe
Incredibly descriptive audiovisual summaries for videos
☆41 · Aug 2, 2024 · Updated last year
hasika000 / xvtp3d
☆11 · May 18, 2023 · Updated 2 years ago
unitn-drive / thesis
Thesis Template
☆10 · Jan 26, 2026 · Updated last month
codezakh / exploiting-BERT-thru-translation
[ACM MM 2021 Oral] Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation
☆39 · Aug 8, 2021 · Updated 4 years ago
MikeGu721 / AgentGroup
☆96 · Mar 26, 2024 · Updated last year
JiwanChung / VisArgs
Corpus to accompany: "Selective Vision is the Challenge for Visual Reasoning: A Benchmark for Visual Argument Understanding"
☆11 · Apr 11, 2025 · Updated 10 months ago
OpenCausaLab / MORE
☆15 · Jan 9, 2026 · Updated last month
SNH48Live / KVM48
The Koudai48 VOD Manager
☆10 · May 2, 2019 · Updated 6 years ago
ananyahjha93 / libself
PyTorch Lightning based framework to run experiments for self-supervised learning tasks.
☆10 · Feb 14, 2020 · Updated 6 years ago
alumik / cail2019
CAIL2019-SCM: Similar case matching in legal domain
☆10 · May 10, 2025 · Updated 9 months ago
saccharomycetes / visual_crop_zsvqa
☆11 · Apr 10, 2024 · Updated last year
ARiSE-Lab / CYCLE_OOPSLA_24
Open-source repository for the OOPSLA'24 paper "CYCLE: Learning to Self-Refine Code Generation"
☆10 · Mar 8, 2024 · Updated last year
DLR-SC / JokeGPT-WASSA23
This repository includes the implementation and results of the paper "ChatGPT is fun, but it is not funny! Humor is still challenging Lar…
☆13 · Jul 13, 2023 · Updated 2 years ago
eddyhhlure1Eddy / ComfyUI-EddySevenResonance
lucky_seed
☆11 · Nov 2, 2025 · Updated 3 months ago