rungalileo / agent-leaderboard
Ranking LLMs on agentic tasks
☆184 Updated last week
Alternatives and similar repositories for agent-leaderboard
Users who are interested in agent-leaderboard are comparing it to the libraries listed below.
- ☆231 Updated 2 months ago
- Readymade evaluators for agent trajectories ☆328 Updated 2 weeks ago
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. ☆135 Updated 3 weeks ago
- Repository demonstrating best practices and patterns for implementing agentic workflows in Python, featuring modular, scalable, and reusa… ☆167 Updated 10 months ago
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆135 Updated 7 months ago
- ARAGOG- Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆109 Updated last year
- ☆73 Updated 10 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆437 Updated 3 weeks ago
- Tutorial for building LLM router ☆228 Updated last year
- ☆165 Updated last week
- ☆95 Updated 5 months ago
- A list of AI memory projects ☆218 Updated 8 months ago
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle ☆295 Updated last week
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆418 Updated 3 weeks ago
- An agent benchmark with tasks in a simulated software company. ☆546 Updated 3 weeks ago
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆120 Updated 7 months ago
- ☆182 Updated 7 months ago
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆391 Updated 2 weeks ago
- MCP (Model Context Protocol) server for Weaviate ☆152 Updated 3 months ago
- ☆76 Updated 6 months ago
- Together Open Deep Research ☆346 Updated 5 months ago
- Official Implementation of "Multi-Head RAG: Solving Multi-Aspect Problems with LLMs" ☆228 Updated 3 months ago
- ☆198 Updated 2 months ago
- This is the official repository for Auto-RAG. ☆221 Updated 2 months ago
- ☆145 Updated last year
- Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs". ☆240 Updated last year
- Dynamic Metadata based RAG Framework ☆75 Updated last year
- ☆76 Updated 8 months ago
- LLM reads a paper and produces a working prototype ☆56 Updated 5 months ago
- ☆185 Updated last week