aws-samples / evaluating-large-language-models-using-llm-as-a-judge
☆18 Updated 3 months ago
Alternatives and similar repositories for evaluating-large-language-models-using-llm-as-a-judge:
Users who are interested in evaluating-large-language-models-using-llm-as-a-judge are comparing it to the repositories listed below.
- ☆41 Updated 4 months ago
- ☆19 Updated 6 months ago
- Streamlit app for recommending eval functions using prompt diffs ☆27 Updated last year
- Writing Blog Posts with Generative Feedback Loops! ☆47 Updated last year
- ☆29 Updated 2 months ago
- A framework for simulating e-commerce data and interactions that can be used to build recommendation systems ☆10 Updated last year
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆39 Updated last year
- ☆1 Updated 9 months ago
- Question Answering Generative AI application with Large Language Models (LLMs) and Amazon OpenSearch Service ☆25 Updated 5 months ago
- ☆29 Updated last year
- This repository is a combination of llama workflows and agents together which is a powerful concept. ☆17 Updated 8 months ago
- The official evaluation suite and dynamic data release for MixEval. ☆11 Updated 7 months ago
- Creating Generative AI Apps which work ☆17 Updated 3 weeks ago
- ☆32 Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 9 months ago
- ☆32 Updated last year
- Testing speed and accuracy of RAG with, and without Cross Encoder Reranker. ☆48 Updated last year
- Adding NeMo Guardrails to a LlamaIndex RAG pipeline ☆37 Updated last year
- A simple Streamlit application to visualize document chunks and queries in embedding space 🗺️🔍 ☆13 Updated 3 weeks ago
- ☆45 Updated 7 months ago
- A Hands-on Practical Guide to LlamaIndex ☆33 Updated 6 months ago
- Experimenting text-embeddings-inference server on both CPU and GPU ☆18 Updated last year
- ☆13 Updated 8 months ago
- A text-to-SQL prototype on the northwind sqlite dataset ☆12 Updated 7 months ago
- CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments ☆54 Updated 2 months ago
- Experimentation on google's gemma model ☆16 Updated last year
- "Syntriever: How to Train Your Retriever with Synthetic Data from LLMs" the Nations of the Americas Chapter of the Association for Comput… ☆25 Updated 2 months ago
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed ☆35 Updated last year
- ☆19 Updated 3 weeks ago
- ☆53 Updated 5 months ago