Compendium of over 50 benchmarks for evaluating AI agents, categorized into Function Calling & Tool Use, General Assistant & Reasoning, Coding & Software Engineering, and Computer Interaction.
☆106 · Oct 15, 2025 · Updated 4 months ago
Alternatives and similar repositories for ai-agent-benchmark-compendium
Users who are interested in ai-agent-benchmark-compendium are comparing it to the repositories listed below
- CLAUDE.md Builder - Advanced meta-system for creating, optimizing, and mastering CLAUDE.md configurations ☆46 · Jul 15, 2025 · Updated 7 months ago
- ☆27 · Feb 27, 2026 · Updated last week
- Example application for creating an MVC Express + Node + TypeScript app and deploying it to Azure ☆10 · Nov 8, 2018 · Updated 7 years ago
- 📦 A collection of pastable code gathered from past projects ☆12 · Sep 9, 2024 · Updated last year
- Python interface for Agora ☆66 · Mar 8, 2025 · Updated 11 months ago
- DACache is a simple cache manager that simplifies caching of data to the file system. ☆13 · Aug 7, 2017 · Updated 8 years ago
- ☆12 · Jun 19, 2024 · Updated last year
- IRPlayer is a powerful video player framework for iOS. ☆15 · Dec 12, 2025 · Updated 2 months ago
- An unofficial template for React + Vite + Cloudflare Workers AI + Hono ☆42 · Mar 22, 2025 · Updated 11 months ago
- Minimal agent runtime built with DSPy modules and a thin Python loop. Includes CLI, FastAPI server, and eval harness with OpenAI/Ollama s… ☆70 · Dec 22, 2025 · Updated 2 months ago
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism" ☆14 · Apr 30, 2025 · Updated 10 months ago
- Repo for the IDESSAI 2024 course on modeling audio with discrete tokens. ☆13 · Sep 13, 2024 · Updated last year
- ☆11 · Sep 29, 2023 · Updated 2 years ago
- A curated collection of my agent-skills ☆25 · Jan 25, 2026 · Updated last month
- 🐰 Easy resolving of deep JSON using keypath in Dart ☆12 · Mar 30, 2021 · Updated 4 years ago
- Snowflake LLM-based text to SQL and document retrieval in Streamlit ☆46 · Nov 16, 2023 · Updated 2 years ago
- ☆10 · Aug 1, 2022 · Updated 3 years ago
- ☆16 · Apr 30, 2025 · Updated 10 months ago
- Adding MIDI to sheet music SVG. A project for the Music Encoding Initiative (MEI) ☆10 · Feb 10, 2026 · Updated 3 weeks ago
- The backup repository for FairytaleQA dataset and paper "Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset f… ☆10 · May 30, 2023 · Updated 2 years ago
- Notes and summaries from what I learn while coding ☆12 · Mar 6, 2019 · Updated 7 years ago
- FinanceRAG project by KAIST students. Advanced Retrieval-Augmented Generation (RAG) system designed for the financial domain. ☆15 · Feb 11, 2025 · Updated last year
- assistant that runs entirely on‑device on Apple‑silicon Macs (M‑series). Chats with a 4‑bit Llama‑3 model accelerated by MLX, and speak… ☆14 · Jun 13, 2025 · Updated 8 months ago
- ☆11 · Aug 26, 2024 · Updated last year
- Firebase application template built on moltres framework ☆12 · Apr 17, 2023 · Updated 2 years ago
- A Framework for Symbolic MUsic Graph Explanations ☆10 · Jul 30, 2025 · Updated 7 months ago
- 🏖️ Instill AI's cortex for frontend ☆12 · Oct 18, 2023 · Updated 2 years ago
- Supervised and unsupervised Concept-based explanation of pretrained music classifiers ☆12 · Jul 27, 2023 · Updated 2 years ago
- Guided meditation assistant, using scheduled messages with LLaMA ☆10 · Nov 28, 2024 · Updated last year
- Canvas Element Recorder for React, with really simple API ☆11 · Oct 16, 2023 · Updated 2 years ago
- You're probably looking for https://github.com/briancavalier/most-behave instead ☆11 · Jul 19, 2018 · Updated 7 years ago
- Changes in this fork have been merged to upstream. ☆16 · Jun 10, 2025 · Updated 8 months ago
- Korean Abstract Meaning Representation (AMR) Corpus ☆10 · Feb 27, 2022 · Updated 4 years ago
- Swift Implementation of the Model Context Protocol (MCP) Spec ☆10 · Mar 28, 2025 · Updated 11 months ago
- A reducer enhancer for using an xstate chart with redux ☆13 · Mar 5, 2018 · Updated 8 years ago
- ☆25 · Sep 10, 2025 · Updated 5 months ago
- Run Claude Code (and codex) to generate a project plan, then run them in a loop for days until they're done ☆14 · Jan 18, 2026 · Updated last month
- ☆26 · Oct 16, 2025 · Updated 4 months ago
- reactive state machines ☆15 · Jan 7, 2023 · Updated 3 years ago