aws-samples / evaluating-large-language-models-using-llm-as-a-judge
☆18 Updated 4 months ago
Alternatives and similar repositories for evaluating-large-language-models-using-llm-as-a-judge
Users who are interested in evaluating-large-language-models-using-llm-as-a-judge are comparing it to the libraries listed below.
- ☆19 Updated 7 months ago
- Streamlit app for recommending eval functions using prompt diffs ☆27 Updated last year
- Writing Blog Posts with Generative Feedback Loops! ☆48 Updated last year
- A framework for simulating e-commerce data and interactions that can be used to build recommendation systems ☆10 Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆25 Updated 6 months ago
- Question Answering Generative AI application with Large Language Models (LLMs) and Amazon OpenSearch Service ☆25 Updated 5 months ago
- ☆10 Updated 8 months ago
- ☆13 Updated 9 months ago
- ☆40 Updated last month
- ☆41 Updated 5 months ago
- ☆38 Updated 10 months ago
- 💙 Unstructured Data Connectors for Haystack 2.0 ☆16 Updated last year
- Creating Generative AI Apps which work ☆17 Updated last month
- ☆43 Updated 3 months ago
- A simple Streamlit application to visualize document chunks and queries in embedding space 🗺️🔍 ☆13 Updated last month
- ☆21 Updated 3 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 10 months ago
- ☆1 Updated 10 months ago
- Build Agentic workflows with function calling using open LLMs ☆26 Updated this week
- A python command-line tool to download & manage MLX AI models from Hugging Face. ☆17 Updated 9 months ago
- ☆15 Updated last month
- ☆20 Updated last month
- AI_Powered_Dev_Search_Engine ☆12 Updated last year
- Simple examples using Argilla tools to build AI ☆53 Updated 6 months ago
- Testing speed and accuracy of RAG with, and without Cross Encoder Reranker. ☆48 Updated last year
- Verifiers for LLM Reinforcement Learning ☆55 Updated last month
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆40 Updated last year
- Measuring RAG solutions throughput and latency ☆17 Updated 10 months ago
- Tools for merging pretrained large language models. ☆19 Updated 11 months ago
- ☆30 Updated 10 months ago