Exploring limitations of LLM-as-a-judge
☆20 · Aug 17, 2024 · Updated last year
Alternatives and similar repositories for llmjudge
Users that are interested in llmjudge are comparing it to the libraries listed below.
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va… ☆12 · Nov 6, 2023 · Updated 2 years ago
- ☆22 · Jan 13, 2025 · Updated last year
- EMMA [TMLR 2025] ☆14 · Sep 25, 2025 · Updated 9 months ago
- ☆22 · Jan 5, 2024 · Updated 2 years ago
- All-in-one Full-Featured Python/Flet/Flutter Application to make the most of all the latest Open-Source AI Art Generators in an intuitive… ☆16 · May 30, 2025 · Updated last year
- ☆25 · Dec 12, 2025 · Updated 6 months ago
- Multi-task modelling extensions for huggingface transformers ☆21 · Mar 3, 2023 · Updated 3 years ago
- ☆30 · Apr 8, 2025 · Updated last year
- ☆32 · Jul 11, 2024 · Updated last year
- Agentkube - Run Kubernetes Like Never Before ☆38 · Mar 1, 2026 · Updated 3 months ago
- Thesis project about Visual Anomaly Detection based on Self Supervised Learning. The model identifies anomalies from information acquired… ☆10 · Apr 14, 2023 · Updated 3 years ago
- Repository collecting resources and best practices to improve experimental rigour in deep learning research. ☆27 · Mar 30, 2023 · Updated 3 years ago
- ☆12 · Jan 21, 2025 · Updated last year
- Reagent interface to the Mafs interactive 2d math visualization library. ☆15 · Jun 1, 2024 · Updated 2 years ago
- URDF description of the JVRC humanoid model ☆15 · Jan 9, 2025 · Updated last year
- Local Ollama with Qdrant RAG: Embed, index, and enhance models for retrieval-augmented generation. Get started with easy setup for powerf… ☆25 · Mar 27, 2024 · Updated 2 years ago
- ☆34 · Nov 17, 2021 · Updated 4 years ago
- EMNLP 2022: Analyzing and Evaluating Faithfulness in Dialogue Summarization ☆13 · Mar 20, 2025 · Updated last year
- The contrastive token loss function for reducing generative repetition of autoregressive neural language models. ☆13 · May 11, 2022 · Updated 4 years ago
- ☆10 · Aug 27, 2019 · Updated 6 years ago
- Scriptable interface to a powerful, multi-lingual language server ☆44 · Updated this week
- Pytorch optimizers implementing Hilbert Constrained Gradient Descent ☆19 · May 9, 2019 · Updated 7 years ago
- Gemini Live API + function calling for patient intake ☆24 · Nov 8, 2025 · Updated 7 months ago
- Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs ☆13 · Feb 13, 2024 · Updated 2 years ago
- Schoenfeld’s Anatomy of Mathematical Reasoning by Language Models ☆27 · Dec 21, 2025 · Updated 6 months ago
- ☆16 · Mar 27, 2023 · Updated 3 years ago
- Investigating Cultural Alignment of Large Language Models ☆13 · Aug 14, 2024 · Updated last year
- Deploying a custom pytorch model to AWS Sagemaker using terraform and FastAPI ☆10 · Nov 10, 2023 · Updated 2 years ago
- Run DeepSeek LLM model locally with Ollama, deepseek-r1:1.5b, and React ☆11 · Jan 29, 2025 · Updated last year
- ☆16 · Aug 19, 2024 · Updated last year
- A tiny collection of robotics problems, for learning and for fun ☆16 · Apr 1, 2025 · Updated last year
- ☆12 · Feb 18, 2020 · Updated 6 years ago
- Package (ROS 1 & ROS 2) for human keypoints identification, 3D reconstruction, tracking, and filtering in collaborative robotics. ☆18 · Nov 20, 2025 · Updated 7 months ago
- Review of dental related datasets for machine learning ☆19 · Feb 24, 2026 · Updated 4 months ago
- [TMLR 2025] A collection of research papers on constraint inference within the field of RL ☆11 · May 9, 2025 · Updated last year
- 🛠 Self-hosted, fast, and consistent remote configuration for apps. ☆17 · Nov 7, 2022 · Updated 3 years ago
- ☆12 · Apr 20, 2023 · Updated 3 years ago
- ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Ablation Capability for Large Vision-Language Models ☆16 · Sep 27, 2024 · Updated last year