Exploring limitations of LLM-as-a-judge
☆20Aug 17, 2024Updated last year
Alternatives and similar repositories for llmjudge
Users that are interested in llmjudge are comparing it to the libraries listed below.
- MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and va…☆12Nov 6, 2023Updated 2 years ago
- ☆22Jan 13, 2025Updated last year
- EMMA [TMLR 2025]☆13Sep 25, 2025Updated 8 months ago
- ☆22Jan 5, 2024Updated 2 years ago
- ☆25Dec 12, 2025Updated 5 months ago
- YAMLE: Yet Another Machine Learning Environment☆34Jan 19, 2025Updated last year
- Multi-task modelling extensions for huggingface transformers☆21Mar 3, 2023Updated 3 years ago
- ☆50May 13, 2024Updated 2 years ago
- ☆29Apr 8, 2025Updated last year
- ☆54Oct 24, 2024Updated last year
- Repository collecting resources and best practices to improve experimental rigour in deep learning research.☆27Mar 30, 2023Updated 3 years ago
- ☆12Jan 21, 2025Updated last year
- URDF description of the JVRC humanoid model☆15Jan 9, 2025Updated last year
- Local Ollama with Qdrant RAG: Embed, index, and enhance models for retrieval-augmented generation. Get started with easy setup for powerf…☆25Mar 27, 2024Updated 2 years ago
- treemind interprets tree models☆41Apr 9, 2026Updated last month
- ☆35Nov 17, 2021Updated 4 years ago
- Logical inference system based on event semantics and degree semantics in formal semantics☆10Jan 22, 2023Updated 3 years ago
- EMNLP 2022: Analyzing and Evaluating Faithfulness in Dialogue Summarization☆13Mar 20, 2025Updated last year
- The contrastive token loss function for reducing generative repetition of autoregressive neural language models.☆13May 11, 2022Updated 4 years ago
- ☆13Nov 15, 2022Updated 3 years ago
- Implements Global Word Vectors.☆11Feb 8, 2020Updated 6 years ago
- ☆10Aug 27, 2019Updated 6 years ago
- Benchmarking framework for Clojure☆10Feb 27, 2019Updated 7 years ago
- A curated list of personalized Language model / Large language model (continually updated)☆10Nov 17, 2023Updated 2 years ago
- [IROS 2025] SIME: Enhancing Policy Self-Improvement with Modal-level Exploration☆17Mar 2, 2026Updated 3 months ago
- Repository for "I am a Strange Dataset: Metalinguistic Tests for Language Models"☆46Jan 11, 2024Updated 2 years ago
- Refined Direct Preference Optimization with Synthetic Data for Behavioral Alignment of LLMs☆13Feb 13, 2024Updated 2 years ago
- Schoenfeld’s Anatomy of Mathematical Reasoning by Language Models☆26Dec 21, 2025Updated 5 months ago
- Spring 2018 Engineering Science and Technology Innovation IV-E: smart car robot☆10May 10, 2018Updated 8 years ago
- Investigating Cultural Alignment of Large Language Models☆13Aug 14, 2024Updated last year
- Deploying a custom pytorch model to AWS Sagemaker using terraform and FastAPI☆10Nov 10, 2023Updated 2 years ago
- Official code for "From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation" (ICLR2026)☆36Mar 1, 2026Updated 3 months ago
- Run DeepSeek LLM model locally with Ollama, deepseek-r1:1.5b, and React☆11Jan 29, 2025Updated last year
- ☆12Feb 18, 2020Updated 6 years ago
- A tiny collection of robotics problems, for learning and for fun☆16Apr 1, 2025Updated last year
- Using Vrep to simulate a six-legged robot to do motion planning & path planning☆10Jan 10, 2019Updated 7 years ago
- Framework and CL tools for high-throughput manipulation of RDF statements (triples and quads).☆10Nov 14, 2020Updated 5 years ago
- [TMLR 2025] A collection of research papers on constraint inference within the field of RL☆11May 9, 2025Updated last year
- 🛠 Self-hosted, fast, and consistent remote configuration for apps.☆17Nov 7, 2022Updated 3 years ago