A curated, non-BS library of the best resources for building and evaluating AI agents — papers, blogs, talks, tools, benchmarks. Maintained by BenchFlow.
☆532 · Jun 27, 2026 · Updated last week
Alternatives and similar repositories for awesome-evals
Users who are interested in awesome-evals are comparing it to the libraries listed below.
- An implementation of Dijkstra in Clojure ☆19 · Aug 7, 2012 · Updated 13 years ago
- NFT Badge for staking on Polygon. This smart contract gives a staker an NFT that represents the staking period (= vesting period) which a stak… ☆11 · May 31, 2021 · Updated 5 years ago
- Browser-based ontology workbench for OWL ontologies and SKOS vocabularies. Streamlit + rdflib, no Java, no Protégé. Bulk operations, OWL-… ☆120 · Updated this week
- Mainframe bruter and screen automation utility. ☆20 · Jul 27, 2021 · Updated 4 years ago
- Synology DLNA scrobbler for trakt.tv ☆15 · Jul 25, 2014 · Updated 11 years ago
- Metadata Editor user and practice guide ☆19 · May 8, 2026 · Updated last month
- Running LLMs against a sandbox airport to see if they can make the correct decisions in real time ☆29 · Jul 22, 2025 · Updated 11 months ago
- Convert various blog dumps to a standard JSON ☆12 · Mar 29, 2026 · Updated 3 months ago
- protoc plugin to publish protobuf messages with Watermill ☆12 · May 6, 2026 · Updated last month
- 📝 The official repository of "Rethinking Cross-Generator Image Forgery Detection through DINOv3" ☆25 · Dec 2, 2025 · Updated 7 months ago
- Test your local LLMs on the AIME problems ☆39 · Jun 7, 2025 · Updated last year
- PDF reader ☆11 · Sep 13, 2017 · Updated 8 years ago
- A simple and elegant assertion library for input validation. ☆10 · Aug 4, 2016 · Updated 9 years ago
- Go framework for language model-powered applications with composability and chaining. Inspired by LangChain. ☆12 · May 2, 2023 · Updated 3 years ago
- Current Alpha version of the ONTO-TRON-5000 ☆41 · Dec 1, 2025 · Updated 7 months ago
- ☆16 · May 25, 2025 · Updated last year
- A really fast document ranking engine using BM25 and TF-IDF. Based on Python using the NLP packages NLTK and spaCy. ☆17 · May 8, 2018 · Updated 8 years ago
- ☆10 · Jul 22, 2021 · Updated 4 years ago
- Go package providing an implementation of HTTP Content Negotiation compliant with RFC 7231 ☆13 · Oct 10, 2021 · Updated 4 years ago
- ☆19 · Aug 19, 2025 · Updated 10 months ago
- Helps you estimate how long software tasks will take ☆23 · May 11, 2025 · Updated last year
- ☆54 · Jun 18, 2026 · Updated 2 weeks ago
- The purpose of this repo is to demonstrate how easy it is to create "Human-In-The-Loop" Durable Tools for MCP servers by leveraging Tempo… ☆21 · Aug 14, 2025 · Updated 10 months ago
- A conversational finance assistant that provides users with real-time stock quotes, market news, and insights on market movers through na… ☆17 · Apr 26, 2025 · Updated last year
- CakePHP Utility Plugin ☆21 · Oct 30, 2024 · Updated last year
- Call OpenAI from Google Sheets and enforce schema output ☆14 · Jun 15, 2024 · Updated 2 years ago
- A repo for my CCN Coding Club talk, 'Python, Rust, and You: Modern Py-Rust Interoperation' ☆45 · Aug 12, 2025 · Updated 10 months ago
- xoshiro256** random number generator ☆22 · May 7, 2018 · Updated 8 years ago
- A scalable concurrent collaboration framework based on Operational Transformation (OT) ☆14 · Dec 10, 2022 · Updated 3 years ago
- Collection of tips for using textgen in various ways ☆20 · Aug 30, 2024 · Updated last year
- Claude Code best practices -- applied to application design. Interactive HLD/LLD visualization, implementation example. LLM-agnostic, DB-… ☆52 · Feb 28, 2026 · Updated 4 months ago
- A reproduction and translation of Stanford's Generative Agents work. An attempt to build a working, locally-running cheap version of Generative Agents: Interactive Simulacra of… ☆86 · Apr 26, 2023 · Updated 3 years ago
- ☆13 · Sep 26, 2018 · Updated 7 years ago
- Go slice collections handling ☆12 · Feb 11, 2020 · Updated 6 years ago
- 🔥 Open Source AI Agents with Self-improvement Capabilities 🔥 ☆25 · Sep 17, 2025 · Updated 9 months ago
- Node.js project starter on steroids: quickly create a Node.js app AND generate source code for data models + REST/GraphQL APIs (the gener… ☆16 · Jan 4, 2023 · Updated 3 years ago
- The open-source adapter for working with RDF databases and SPARQL queries in Jupyter notebooks leveraging the yFiles Graphs for Jupyter p… ☆24 · Apr 4, 2025 · Updated last year
- Go library for the JW Platform API ☆12 · May 31, 2023 · Updated 3 years ago
- A concurrent toolkit to help execute funcs concurrently in an efficient and safe way. It supports specifying the overall timeout to avoid… ☆17 · May 31, 2026 · Updated last month