Exploring limitations of LLM-as-a-judge
☆20Aug 17, 2024Updated last year
Alternatives and similar repositories for llmjudge
Users that are interested in llmjudge are comparing it to the libraries listed below.
- Modified Beam Search with periodical restart☆12Sep 12, 2024Updated last year
- MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va…☆12Nov 6, 2023Updated 2 years ago
- ☆22Jan 13, 2025Updated last year
- EMMA [TMLR 2025]☆12Sep 25, 2025Updated 6 months ago
- REBUS: A Robust Evaluation Benchmark of Understanding Symbols☆13Aug 13, 2024Updated last year
- ☆22Jan 5, 2024Updated 2 years ago
- ☆25Dec 12, 2025Updated 4 months ago
- Multi-task modelling extensions for huggingface transformers☆21Mar 3, 2023Updated 3 years ago
- ☆32Jul 11, 2024Updated last year
- ☆29Apr 8, 2025Updated last year
- JacQues is a Dash-based interactive web application that facilitates real-time chat and document management.☆23Jan 5, 2026Updated 3 months ago
- Repository collecting resources and best practices to improve experimental rigour in deep learning research.☆27Mar 30, 2023Updated 3 years ago
- Thesis project about Visual Anomaly Detection based on Self Supervised Learning. The model identifies anomalies from information acquired…☆10Apr 14, 2023Updated 3 years ago
- Reagent interface to the Mafs interactive 2d math visualization library.☆15Jun 1, 2024Updated last year
- URDF description of the JVRC humanoid model☆15Jan 9, 2025Updated last year
- Local Ollama with Qdrant RAG: Embed, index, and enhance models for retrieval-augmented generation. Get started with easy setup for powerf…☆25Mar 27, 2024Updated 2 years ago
- ☆35Nov 17, 2021Updated 4 years ago
- EMNLP 2022: Analyzing and Evaluating Faithfulness in Dialogue Summarization☆13Mar 20, 2025Updated last year
- The contrastive token loss function for reducing generative repetition of autoregressive neural language models.☆13May 11, 2022Updated 3 years ago
- ☆12Nov 15, 2022Updated 3 years ago
- Implements Global Word Vectors.☆11Feb 8, 2020Updated 6 years ago
- Scriptable interface to a powerful, multi-lingual language server☆37Apr 12, 2026Updated last week
- ☆10Aug 27, 2019Updated 6 years ago
- ☆13Aug 4, 2022Updated 3 years ago
- Spring 2018 Engineering Science and Technology Innovation IV-E: Smart Car Robot☆10May 10, 2018Updated 7 years ago
- Uncertainty-Aware Curriculum Learning for Neural Machine Translation (ACL 2020)☆11Jun 12, 2020Updated 5 years ago
- code for promptCSE, emnlp 2022☆11Apr 10, 2023Updated 3 years ago
- Deploying a custom pytorch model to AWS Sagemaker using terraform and FastAPI☆10Nov 10, 2023Updated 2 years ago
- Run DeepSeek LLM model locally with Ollama, deepseek-r1:1.5b, and React☆11Jan 29, 2025Updated last year
- ☆15Aug 19, 2024Updated last year
- The Clojure library for JSON-LD (JavaScript Object Notation for Linking Data).☆16Feb 7, 2019Updated 7 years ago
- Package (ROS 1 & ROS 2) for human keypoints identification, 3D reconstruction, tracking, and filtering in collaborative robotics.☆18Nov 20, 2025Updated 5 months ago
- Review of dental related datasets for machine learning☆13Feb 24, 2026Updated last month
- Stable Diffusion web UI☆10Mar 17, 2024Updated 2 years ago
- SiMM: Scalable in-Memory Middleware☆39Updated this week
- ☆12Apr 20, 2023Updated 3 years ago
- Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments (Zhou et al., EMNLP 2024)☆14Oct 3, 2024Updated last year
- ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Ablation Capability for Large Vision-Language Models☆16Sep 27, 2024Updated last year
- Tools for working with Iterators of Iterators of ...., with particular application in NLP which has Corpus made up of Document made up of…☆13Aug 27, 2021Updated 4 years ago