A comprehensive evaluation framework for AI agents and LLM applications.
☆127May 14, 2026Updated last week
Alternatives and similar repositories for evals
Users that are interested in evals are comparing it to the libraries listed below.
- Implementation of 12 AI agents evaluation techniques☆43Jul 31, 2025Updated 9 months ago
- Amazon Nova Act is an AWS service for building and deploying highly reliable AI agents that automate UI-based workflows at scale.☆64Apr 30, 2026Updated 3 weeks ago
- From nothing to a deployed object detection model on SageMaker with Detectron2☆29Oct 17, 2023Updated 2 years ago
- GPT4 based personalized ArXiv paper assistant bot☆12Mar 1, 2024Updated 2 years ago
- The new terminal experience for AgentCore!☆133May 18, 2026Updated last week
- Edit and Generate Anything in 3D world!☆13Apr 15, 2023Updated 3 years ago
- Python CLI toolkit for Amazon Bedrock AgentCore (legacy). For new projects, use the AgentCore CLI: https://github.com/aws/agentcore-cli☆484May 13, 2026Updated last week
- ☆17Updated this week
- ☆34Dec 13, 2025Updated 5 months ago
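- beko-translate is a CLI translation tool for Apple Silicon Macs. It also bundles a PDF two-page-spread translation feature that can display the original and translated text alternately.☆35Mar 25, 2026Updated 2 months ago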
- Repository of GUI Action Narrator☆13Apr 8, 2025Updated last year
- [ICLR 2026] Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing☆28May 11, 2026Updated 2 weeks ago
- ☆13Mar 14, 2024Updated 2 years ago
- 👨‍💼 Python Wrapper for the LinkedIn API☆22Jun 30, 2018Updated 7 years ago
- Manage Workflows with optional Scheduler or Event Arc triggers☆21Feb 24, 2026Updated 3 months ago
- ☆15Apr 26, 2026Updated 3 weeks ago
- Deep learning for pedestrians: backpropagation in CNNs. Latex and PyTorch code to verify theoretical derivations.☆13Jun 21, 2022Updated 3 years ago
- A script that uses GPT-4 to automatically evaluate language model responses☆17Jun 6, 2024Updated last year
- Self-hosting Langfuse on Amazon ECS with Fargate using CDK Python☆77Jun 24, 2025Updated 11 months ago
- Insurance AI Assistant A smart system combining PostgreSQL, Milvus, and specialized AI agents (Life/Home/Auto) to answer insurance querie…☆30Apr 29, 2025Updated last year
- [ECCV 2022] GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval☆17Aug 24, 2022Updated 3 years ago
- LLMPerf is a library for validating and benchmarking LLMs☆11Aug 13, 2024Updated last year
- An example agent demonstrating streaming, tool use, and interactivity from your terminal. This agent builder can help you to build your o…☆413May 12, 2026Updated last week
- [CVPR 2026] Official Implementation of Edit2Perceive☆39Feb 21, 2026Updated 3 months ago
- How to build a simplified Corrective RAG assistant with Amazon Bedrock using LLMs, Embeddings model, Knowledge Bases for Amazon Bedrock, …☆16May 22, 2024Updated 2 years ago
- Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs. EMNLP 2024☆27Nov 13, 2024Updated last year
- Examples showing use of NGC containers and models within Amazon SageMaker☆17Oct 4, 2022Updated 3 years ago
- ☆13Mar 25, 2023Updated 3 years ago
- GPT-2 Metadata Pretraining Towards Instruction Finetuning for Ukrainian☆20Aug 6, 2023Updated 2 years ago
- Multi-modal Assistant With Advanced RAG And Amazon Bedrock Claude 3☆20Feb 7, 2025Updated last year
- ☆25Nov 18, 2025Updated 6 months ago
- ☆25Apr 8, 2026Updated last month
- Build complex, serverless, and highly scalable generative AI applications with prompt chaining.☆316May 5, 2026Updated 2 weeks ago
- My dotfiles managed by chezmoi☆34Updated this week
- This repository is part of a blog post that guides users through creating an NLU search application using Amazon SageMaker and Amazon Elas…☆17Jul 19, 2023Updated 2 years ago
- AI/ML/DL & GenAI resources & projects. DeepLearning.AI, Hugging Face, OpenAI, Amazon Bedrock, Google Vertex AI, and more.☆14Apr 21, 2026Updated last month
- ☆13Jul 14, 2025Updated 10 months ago
- Amazon Bedrock demo of the four major generative AI use cases in e-commerce☆18Feb 19, 2025Updated last year
- hierarchical core-periphery structure☆10Jul 21, 2023Updated 2 years ago