ai-evals-course/judgy

ai-evals-course / judgy

Python package for estimating confidence intervals (CIs) for metrics evaluated by LLM-as-Judges.

☆91

Alternatives and similar repositories for judgy

Users who are interested in judgy are comparing it to the libraries listed below.

ai-evals-course / recipe-chatbot
☆321 · Mar 24, 2026 · Updated 4 months ago
ai-evals-course / isaac-fasthtml-workshop
☆69 · Aug 5, 2025 · Updated 11 months ago
zenbase-ai / aiai-cli
☆17 · Dec 16, 2025 · Updated 7 months ago
MaximeRivest / ovllm
☆39 · Aug 4, 2025 · Updated 11 months ago
kmad / dabench-rlm-eval
Benchmark harness for evaluating DSPy RLMs on data analysis tasks (InfiAgent-DABench)
☆23 · Mar 22, 2026 · Updated 4 months ago
hamelsmu / evals-skills
Skills for AI Evals to complement the course: AI Evals For Engineers & PMs
☆1,558 · Jun 10, 2026 · Updated last month
shreyashankar / error-discovery-skill
Interactive error analysis skill for AI agents. Studies LLM trace datasets, builds a review UI, monitors annotations, categorizes failure…
☆160 · Jun 25, 2026 · Updated last month
raveeshbhalla / dspy-gepa-logger
☆58 · Jan 28, 2026 · Updated 6 months ago
ruvnet / aws-dev
AWS Dev environment
☆19 · May 20, 2024 · Updated 2 years ago
darinkishore / codex_dspy
DSPy module for OpenAI Codex SDK - signature-driven agentic workflows
☆165 · Dec 8, 2025 · Updated 7 months ago
hamelsmu / hamel
General Utilities
☆58 · Jun 21, 2026 · Updated last month
AnswerDotAI / web2md-ext
Get a markdown version of any webpage with a keyboard shortcut.
☆69 · Feb 17, 2025 · Updated last year
sebastianschramm / fastapi_hf_endpoints
Custom fastapi server packaged as docker image for Huggingface inference endpoints deployment
☆13 · Apr 17, 2024 · Updated 2 years ago
haizelabs / verdict
Inference-time scaling for LLMs-as-a-judge.
☆346 · Nov 5, 2025 · Updated 8 months ago
Archelunch / dspy-repl
☆46 · Feb 20, 2026 · Updated 5 months ago
eugeneyan / align-app
☆98 · Nov 9, 2024 · Updated last year
americanthinker / rag-applications
RAG applications repo for Uplimit course
☆10 · Jul 20, 2025 · Updated last year
evalops / dspy-micro-agent
Minimal agent runtime built with DSPy modules and a thin Python loop. Includes CLI, FastAPI server, and eval harness with OpenAI/Ollama s…
☆75 · Apr 25, 2026 · Updated 3 months ago
labdac / Meta-Prod2Vec
Repository for experiments with MetaProd2Vec and related algorithms.
☆59 · Mar 16, 2019 · Updated 7 years ago
Nadkarni-Lab / LLM_CodeQuery
Code repository for "Generative Large Language Models are Poor Medical Coders: A Benchmarking Analysis of Medical Code Querying"
☆13 · Sep 16, 2024 · Updated last year
nickaggarwal / nvidia-triton-llm-streaming
Integrating SSE with NVIDIA Triton Inference Server using a Python backend and Zephyr model. There is very little documentation on how to use …
☆10 · May 29, 2024 · Updated 2 years ago
weaviate / retrieve-dspy
A collection of Compound Retrieval Systems implemented with DSPy and Weaviate.
☆99 · Jun 1, 2026 · Updated last month
ucbepic / docetl
A system for agentic LLM-powered data processing and ETL
☆3,950 · Jul 21, 2026 · Updated last week
MaximeRivest / parakeet-stream
Simple, powerful streaming transcription for Python using NVIDIA's Parakeet TDT
☆21 · Oct 18, 2025 · Updated 9 months ago
tcapelle / mistral_wandb
A full-fledged mistral+wandb
☆13 · Aug 16, 2024 · Updated last year
ag-ds-bubble / tseuler
A library for Time-Series exploration, analysis & modelling.
☆17 · Dec 10, 2020 · Updated 5 years ago
softwaredoug / local-llm-judge
Local LLM as a search relevance judge
☆30 · Mar 2, 2025 · Updated last year
SylphAI-Inc / adal-bootcamp-2
☆15 · Jul 5, 2026 · Updated 3 weeks ago
TransluceAI / jailbreaking-frontier-models
☆28 · Sep 3, 2025 · Updated 10 months ago
Ryu1845 / hyena-jax
Implementation of Hyena Hierarchy in JAX
☆10 · Apr 30, 2023 · Updated 3 years ago
mlops-club / serverless-fastapi
Example FastAPI app deployed to AWS with CDK.
☆16 · Feb 23, 2023 · Updated 3 years ago
laulauland / dotfiles
☆18 · Jul 20, 2026 · Updated last week
RoheLab / aPPR
Approximate Personalized Page Rank
☆16 · Jun 27, 2024 · Updated 2 years ago
muellerzr / fastai2_tabular_hybrid
Developing and integrating methods for fastai2 tabular with other datatypes
☆26 · Oct 6, 2022 · Updated 3 years ago
zwrankin / dash_tutorial
Gentle introduction to dash development and deployment via Heroku
☆11 · Dec 8, 2022 · Updated 3 years ago
Reapor-Yurnero / imprompter
Codebase of https://arxiv.org/abs/2410.14923
☆54 · Oct 22, 2024 · Updated last year
msull / consciousness-sim
Winning Hackathon entry for Streamlit LLM Hackathon October 2023
☆16 · Oct 19, 2023 · Updated 2 years ago
huggingface / evaluation-guidebook
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a…
☆2,131 · Dec 3, 2025 · Updated 7 months ago
koaning / notebooks
Notebook for safekeeps
☆30 · Jul 16, 2026 · Updated last week