A comprehensive evaluation framework for AI agents and LLM applications.
☆116Apr 29, 2026Updated this week
Alternatives and similar repositories for evals
Users that are interested in evals are comparing it to the libraries listed below.
- Implementation of 12 AI agents evaluation techniques☆43Jul 31, 2025Updated 9 months ago
- Amazon Nova Act is an AWS service for building and deploying highly reliable AI agents that automate UI-based workflows at scale.☆64Updated this week
- From nothing to a deployed object detection model on SageMaker with Detectron2☆29Oct 17, 2023Updated 2 years ago
- The IDP Accelerator provides a scalable, serverless approach for automated document processing and information extraction using AWS servi…☆230Updated this week
- Manage Workflows with optional Scheduler or Event Arc triggers☆21Feb 24, 2026Updated 2 months ago
- ☆15Apr 26, 2026Updated last week
- Deep learning for pedestrians: backpropagation in CNNs. LaTeX and PyTorch code to verify theoretical derivations.☆13Jun 21, 2022Updated 3 years ago
- ☆15Jul 4, 2025Updated 10 months ago
- Blueprint for running AWS Bedrock Multi-Agent AI collaboration with CDK, Graph DB, Streamlit and LangFuse☆21May 2, 2025Updated last year
- ☆25Jun 5, 2024Updated last year
- ☆15Apr 20, 2026Updated 2 weeks ago
- Biologically-inspired persistent memory engine for Claude Code. 26 cognitive subsystems, Hopfield networks, predictive coding, causal dis…☆54Apr 1, 2026Updated last month
- LLMPerf is a library for validating and benchmarking LLMs☆11Aug 13, 2024Updated last year
- BedrockSmith - Formats and displays Bedrock invocation logs written to CloudWatch Logs☆13Feb 3, 2025Updated last year
- C# .NET bit-parallel accelerated fuzzy string matching implementation of SeatGeek's well-known Python FuzzyWuzzy algorithm.☆33Apr 26, 2026Updated last week
- Examples showing use of NGC containers and models within Amazon SageMaker☆17Oct 4, 2022Updated 3 years ago
- ☆13Mar 25, 2023Updated 3 years ago
- Multi-modal Assistant With Advanced RAG And Amazon Bedrock Claude 3☆20Feb 7, 2025Updated last year
- ☆25Nov 18, 2025Updated 5 months ago
- ☆25Apr 8, 2026Updated 3 weeks ago
- VS Code Clinical Quality Language Extension☆12Apr 23, 2026Updated last week
- Build complex, serverless, and highly scalable generative AI applications with prompt chaining.☆315Updated this week
- Natural Language Processing Library for Korean Literary Text. (Will be open in February, 2024)☆12Jan 16, 2024Updated 2 years ago
- Heroku/Dash app for inDelphi.☆11Dec 8, 2022Updated 3 years ago
- ☆25May 29, 2025Updated 11 months ago
- Amazon Bedrock demo of four major generative AI use cases in e-commerce☆18Feb 19, 2025Updated last year
- hierarchical core-periphery structure☆10Jul 21, 2023Updated 2 years ago
- EvalBench is a flexible framework designed to measure the quality of generative AI (GenAI) workflows around database specific tasks.☆42Updated this week
- ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL (ICLR 2025 PyTorch Code)☆17May 15, 2025Updated 11 months ago
- Implementation of a LangGraph.js CheckpointSaver that uses AWS DynamoDB☆16Feb 10, 2025Updated last year
- In-BoXBART: Get Instructions into Biomedical Multi-task Learning☆15Aug 23, 2022Updated 3 years ago
- Swallow project: evaluation scripts for large language models☆24Sep 17, 2025Updated 7 months ago
- Bayesball: Bayesian analysis of batting average☆12Mar 4, 2018Updated 8 years ago
- code for epidemics spreading, heterogeneous random walk on network☆13Apr 12, 2021Updated 5 years ago
- AWS level assessment tool (AWSレベル判定くん)☆25Feb 22, 2026Updated 2 months ago
- Know who your representative in HoPR is.☆42Apr 23, 2026Updated last week
- ☆166May 19, 2025Updated 11 months ago
- A model-driven approach to building AI agents in just a few lines of code.☆5,765Updated this week
- Developed a GPA & CGPA calculator website tailored for SRM IST, Trichy Campus, to simplify academic performance tracking. This user-frien…☆11Jul 22, 2024Updated last year