Exploring limitations of LLM-as-a-judge
☆20Aug 17, 2024Updated last year
Alternatives and similar repositories for llmjudge
Users who are interested in llmjudge are comparing it to the libraries listed below.
- REBUS: A Robust Evaluation Benchmark of Understanding Symbols☆13Aug 13, 2024Updated last year
- Modified Beam Search with periodic restarts☆12Sep 12, 2024Updated last year
- MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va…☆12Nov 6, 2023Updated 2 years ago
- EMMA [TMLR 2025]☆12Sep 25, 2025Updated 5 months ago
- ☆22Jan 13, 2025Updated last year
- ☆22Jan 5, 2024Updated 2 years ago
- Local Ollama with Qdrant RAG: Embed, index, and enhance models for retrieval-augmented generation. Get started with easy setup for powerf…☆25Mar 27, 2024Updated last year
- JacQues is a Dash-based interactive web application that facilitates real-time chat and document management.☆22Jan 5, 2026Updated 2 months ago
- ☆49May 13, 2024Updated last year
- ☆54Oct 24, 2024Updated last year
- Llama cute voice assistant☆27Sep 10, 2023Updated 2 years ago
- Repository collecting resources and best practices to improve experimental rigour in deep learning research.☆27Mar 30, 2023Updated 2 years ago
- treemind interprets tree models☆41Jul 23, 2025Updated 7 months ago
- Detecting car parking slots in open car park spaces☆13Oct 21, 2019Updated 6 years ago
- Oak National Academy's AI Auto Eval tools provide LLM as a judge evaluation on lesson plans and resources☆17Nov 4, 2025Updated 4 months ago
- benchmarks for LLM tokenizers☆17Feb 27, 2026Updated last week
- ☆35Nov 17, 2021Updated 4 years ago
- ☆12Jan 21, 2025Updated last year
- A Python-based voice assistant integrating speech-to-text (STT), text-to-speech (TTS), and powerful AI capabilities using either a local …☆13Dec 8, 2025Updated 3 months ago
- The classic movies redux with machine learning using TensorFlow and Keras.☆11Feb 12, 2019Updated 7 years ago
- This project showcases engaging interactions between two AI chatbots.☆10Jan 10, 2024Updated 2 years ago
- ☆17Feb 6, 2025Updated last year
- A frontend interface for interacting with AI Models. Compatible with Ollama and OpenAI☆10May 1, 2025Updated 10 months ago
- 🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform☆38Jan 29, 2024Updated 2 years ago
- The one who calls upon functions - Function-Calling Language Model☆36Oct 2, 2023Updated 2 years ago
- ☆29Apr 22, 2024Updated last year
- PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions (NeurIPS 2025 D&B track, Spotlight)☆24Feb 11, 2026Updated 3 weeks ago
- This is an A/B test project from Udacity.☆12Dec 24, 2019Updated 6 years ago
- A simple example of a PySpark-based project.☆11Jun 3, 2016Updated 9 years ago
- Detect-Then-Explain Framework for Text-to-SQL task☆10Dec 6, 2023Updated 2 years ago
- A review of class imbalance problems using data augmentation and ensemble learning☆10Mar 15, 2023Updated 2 years ago
- Our repo contains an efficient RGB-D feature extractor for category-level and instance-level 6D pose estimation.☆14Oct 29, 2025Updated 4 months ago
- ☆13Aug 4, 2022Updated 3 years ago
- Reinforcement Learning (PPO) applied to a multiplayer simple card game (Witches)☆10Jun 7, 2020Updated 5 years ago
- Repo for our work "Systematic Evaluation of Large Vision-Language Models for Surgical Artificial Intelligence"☆19Jun 2, 2025Updated 9 months ago
- ☆22Jun 10, 2025Updated 8 months ago
- Implements Global Word Vectors.☆11Feb 8, 2020Updated 6 years ago
- ☆10Nov 7, 2022Updated 3 years ago
- Running Microsoft's BitNet via Electron, React & Astro☆53Sep 26, 2025Updated 5 months ago