rungalileo / agent-leaderboard
Ranking LLMs on agentic tasks
☆205 Updated last month
Alternatives and similar repositories for agent-leaderboard
Users who are interested in agent-leaderboard are comparing it to the libraries listed below.
- Readymade evaluators for agent trajectories ☆436 Updated 3 months ago
- ☆236 Updated last month
- Tutorial for building LLM router ☆240 Updated last year
- A bot with memory, built on LangGraph Cloud. ☆144 Updated last year
- Repository demonstrating best practices and patterns for implementing agentic workflows in Python, featuring modular, scalable, and reusa… ☆182 Updated last year
- ☆182 Updated 10 months ago
- Workflows are an event-driven, async-first, step-based way to control the execution flow of AI applications like agents. ☆298 Updated last week
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. ☆140 Updated 4 months ago
- Terminal-based AI Coding Agent, similar to Claude Code, OpenAI Codex etc. but works with many more LLMs e.g. Gemini, Groq, Deepseek ☆151 Updated 7 months ago
- ☆74 Updated last year
- ☆179 Updated last week
- MCP (Model Context Protocol) server for Weaviate ☆161 Updated 7 months ago
- ☆219 Updated 5 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆486 Updated 4 months ago
- ☆117 Updated 4 months ago
- Together Open Deep Research ☆356 Updated 8 months ago
- Research assistant for performing online research on a given topic, using Llamaindex Workflows and Tavily API. Inspired by GPT-Researcher ☆169 Updated last year
- An example of multi-agent orchestration with llama-index ☆444 Updated 11 months ago
- An Awesome list of curated DSPy resources. ☆494 Updated 3 weeks ago
- Building LLM-Enabled Multi Agent Applications from Scratch ☆295 Updated 2 weeks ago
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle ☆300 Updated 2 weeks ago
- Testing and evaluation framework for voice agents ☆161 Updated 6 months ago
- 👩‍⚖️ Agent-as-a-Judge: The Magic for Open-Endedness ☆693 Updated 7 months ago
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆142 Updated 10 months ago
- ☆214 Updated 3 months ago
- ☆148 Updated 3 weeks ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆115 Updated 8 months ago
- A programming framework for agentic AI. Discord: https://discord.gg/pAbnFJrkgZ ☆137 Updated 10 months ago
- ☆148 Updated last year
- ☆103 Updated 9 months ago