MCPToolBench++: MCP (Model Context Protocol) Tool Use Benchmark on AI Agent and Model Tool Use Ability
☆45 · Mar 17, 2026 · Updated last month
Alternatives and similar repositories for MCPToolBenchPP
Users who are interested in MCPToolBenchPP are comparing it to the libraries listed below.
- ComfyUI node for modular, human‑like Kani TTS. Generate natural, high‑quality speech from text ☆38 · Oct 17, 2025 · Updated 6 months ago
- ☆14 · Apr 27, 2022 · Updated 4 years ago
- A PoC to trigger CVE-2023-5217 from the Browser WebCodecs or MediaRecorder interface. ☆17 · Oct 11, 2023 · Updated 2 years ago
- ☆31 · Feb 27, 2025 · Updated last year
- ☆28 · Feb 11, 2026 · Updated 2 months ago
- Notes and work-in-progress for BPF-related research projects ☆12 · Jan 10, 2025 · Updated last year
- LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussians ☆25 · Jan 10, 2025 · Updated last year
- This is the official code repository for the paper: Towards General Continuous Memory for Vision-Language Models. ☆26 · Jul 3, 2025 · Updated 10 months ago
- Python Wrapper for RnNoise v0.2 ☆77 · Jan 14, 2026 · Updated 3 months ago
- A Sparse-tensor Communication Framework for Distributed Deep Learning ☆13 · Nov 1, 2021 · Updated 4 years ago
- An omnipowerful personal assistant powered by LLMs, Zapier NLA, and custom actions. ☆15 · Sep 13, 2024 · Updated last year
- LiveMCPBench is a benchmark for evaluating the ability of agents to navigate and utilize a large-scale MCP toolset. It provides a compreh… ☆97 · Dec 18, 2025 · Updated 4 months ago
- FLoRA: A Framework for Learning Scoring Rules in Autonomous Driving Planning Systems ☆13 · Apr 12, 2026 · Updated 3 weeks ago
- Script to demonstrate how to use a Language Model for Semantic Turn Detection. Refer to blog post for full details. ☆17 · May 9, 2025 · Updated 11 months ago
- ☆10 · Apr 20, 2025 · Updated last year
- To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models ☆33 · May 21, 2025 · Updated 11 months ago
- AIRS-Bench: an AI Research Science benchmark for quantifying the end-to-end AI research abilities of LLM agents ☆84 · Apr 24, 2026 · Updated last week
- decision-making processes of human drivers ☆13 · Mar 28, 2024 · Updated 2 years ago
- PeTAL: Ensuring Access Control Integrity against Data-only Attacks on Linux (ACM CCS 2024) ☆16 · Nov 4, 2024 · Updated last year
- [WWW '24] UnifiedSSR: A Unified Framework of Sequential Search and Recommendation ☆12 · Feb 16, 2024 · Updated 2 years ago
- Röttger et al. (2025): "MSTS: A Multimodal Safety Test Suite for Vision-Language Models" ☆17 · Mar 31, 2025 · Updated last year
- PFI: Prompt Flow Integrity to Prevent Privilege Escalation in LLM Agents ☆28 · Mar 26, 2025 · Updated last year
- Deep Learning 2021 in School of Data Science, USTC ☆12 · May 17, 2023 · Updated 2 years ago
- Example repo showcasing model training and deployment with distil claude cli skill ☆56 · Jan 19, 2026 · Updated 3 months ago
- [ICASSP 2026] Task Vector in TTS: Toward Emotionally Expressive Dialectal Speech Synthesis ☆39 · Dec 24, 2025 · Updated 4 months ago
- Templates and examples for ACL and EMNLP conference posters. ☆14 · Oct 5, 2024 · Updated last year
- The official implementation of the paper "AgentDyn: A Dynamic Open-Ended Benchmark for Evaluating Prompt Injection Attacks of Real-World … ☆48 · Apr 19, 2026 · Updated 2 weeks ago
- Official code of "The Automated but Risky Game: Modeling Agent-to-Agent Negotiations and Transactions in Consumer Markets" ☆25 · Mar 24, 2026 · Updated last month
- A TTS Trained on Universal Audio. ☆41 · Jun 6, 2025 · Updated 11 months ago
- General benchmarking apparatus for running multi-agent systems against benchmarks ☆46 · Apr 13, 2026 · Updated 3 weeks ago
- ☆36 · Aug 29, 2025 · Updated 8 months ago
- Fine-tune LLMs and ML models with automatic dataset conversion, hyperparameter sweeps, and custom RL environments ☆52 · Mar 14, 2026 · Updated last month
- A comprehensive repository for Compute Express Link (CXL) resources: covering research papers, specifications, simulation/emulation tools… ☆25 · Feb 24, 2026 · Updated 2 months ago
- Scaling Agentic Environments Automatically. ☆62 · Mar 26, 2026 · Updated last month
- My YouTube tutorial codes ☆14 · Oct 10, 2025 · Updated 6 months ago
- This code was written quite some time ago for the purpose of processing the NGSIM dataset. While it might not be the epitome of organizat… ☆10 · Oct 5, 2023 · Updated 2 years ago
- ☆14 · Nov 22, 2024 · Updated last year
- A simple pokerogue.net save editor ☆11 · May 14, 2024 · Updated last year
- Application of Retrieval-Augmented Reasoning on a domain-specific body of knowledge ☆34 · Feb 27, 2026 · Updated 2 months ago