rungalileo / agent-leaderboard
Ranking LLMs on agentic tasks
☆196 Updated last month
Alternatives and similar repositories for agent-leaderboard
Users who are interested in agent-leaderboard are comparing it to the repositories listed below.
- ☆232 Updated 3 months ago
- Readymade evaluators for agent trajectories ☆365 Updated last month
- Research repository on interfacing LLMs with Weaviate APIs. Inspired by the Berkeley Gorilla LLM. ☆136 Updated 2 months ago
- Tutorial for building LLM router ☆232 Updated last year
- Official Code for Oᴘᴇɴ-RAG: Enhanced Retrieval Augmented Reasoning with Open-Source Large Language Models (EMNLP Findings 2024) ☆138 Updated 8 months ago
- Repository demonstrating best practices and patterns for implementing agentic workflows in Python, featuring modular, scalable, and reusa… ☆172 Updated last year
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆120 Updated 8 months ago
- ☆171 Updated this week
- A list of AI memory projects ☆235 Updated 9 months ago
- A bot with memory, built on LangGraph Cloud. ☆138 Updated last year
- ☆181 Updated 8 months ago
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆138 Updated 8 months ago
- ☆79 Updated 9 months ago
- ☆146 Updated last year
- Learn to build and customize multi-agent systems using the AutoGen. The course teaches you to implement complex AI applications through a… ☆117 Updated last year
- An agent benchmark with tasks in a simulated software company. ☆570 Updated 2 weeks ago
- ☆73 Updated last year
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆463 Updated 2 months ago
- ☆215 Updated 3 months ago
- Salesforce Enterprise Deep Research ☆680 Updated this week
- ARAGOG- Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆114 Updated last year
- Testing and evaluation framework for voice agents ☆152 Updated 4 months ago
- ☆96 Updated 7 months ago
- ⚖️ The First Coding Agent-as-a-Judge ☆646 Updated 5 months ago
- Training setup for Langchain's Open Deep Research ☆67 Updated 2 months ago
- Research assistant for performing online research on a given topic, using Llamaindex Workflows and Tavily API. Inspired by GPT-Researcher ☆168 Updated last year
- An Awesome list of curated DSPy resources. ☆461 Updated 3 weeks ago
- GenAIOps on Kubernetes: A collection of reference architectures for running GenAI at scale on Kubernetes using OSS tooling ☆134 Updated last year
- 🤗 Benchmark Large Language Models Reliably On Your Data ☆406 Updated 3 weeks ago
- Rank LLMs, RAG systems, and prompts using automated head-to-head evaluation ☆105 Updated 10 months ago