rungalileo / agent-leaderboard
Ranking LLMs on agentic tasks
☆180 Updated 2 weeks ago
Alternatives and similar repositories for agent-leaderboard
Users who are interested in agent-leaderboard are comparing it to the libraries listed below.
- ☆230 Updated last month
- Readymade evaluators for agent trajectories ☆306 Updated last month
- Tutorial for building LLM router ☆224 Updated last year
- ☆180 Updated 6 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆398 Updated 2 weeks ago
- An agent benchmark with tasks in a simulated software company. ☆534 Updated this week
- Repository demonstrating best practices and patterns for implementing agentic workflows in Python, featuring modular, scalable, and reusa… ☆157 Updated 10 months ago
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆133 Updated 6 months ago
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. ☆134 Updated this week
- ARAGOG- Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆108 Updated last year
- ☆164 Updated this week
- ☆94 Updated 5 months ago
- This is the official repository for Auto-RAG. ☆218 Updated last month
- TapeAgents is a framework that facilitates all stages of the LLM Agent development lifecycle ☆294 Updated last week
- Official Code for Oᴘᴇɴ-RAG: Enhanced Retrieval Augmented Reasoning with Open-Source Large Language Models (EMNLP Findings 2024) ☆132 Updated 6 months ago
- Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs". ☆238 Updated last year
- ⚖️ The First Coding Agent-as-a-Judge ☆605 Updated 3 months ago
- Together Open Deep Research ☆338 Updated 4 months ago
- Banishing LLM Hallucinations Requires Rethinking Generalization ☆276 Updated last year
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆387 Updated this week
- A list of AI memory projects ☆205 Updated 7 months ago
- ☆372 Updated last year
- ☆76 Updated 7 months ago
- An example of multi-agent orchestration with llama-index ☆432 Updated 7 months ago
- ☆71 Updated 10 months ago
- Research assistant for performing online research on a given topic, using Llamaindex Workflows and Tavily API. Inspired by GPT-Researcher ☆166 Updated 11 months ago
- Build datasets using natural language ☆518 Updated 3 months ago
- Repository to demonstrate Chain of Table reasoning with multiple tables powered by LangGraph ☆147 Updated last year
- II-Researcher: a new open-source framework designed to aid building search / research agents ☆468 Updated 3 weeks ago
- Testing and evaluation framework for voice agents ☆147 Updated 2 months ago