aws-samples / evaluating-large-language-models-using-llm-as-a-judge
☆16 Updated 3 months ago
Alternatives and similar repositories for evaluating-large-language-models-using-llm-as-a-judge:
Users who are interested in evaluating-large-language-models-using-llm-as-a-judge are comparing it to the repositories listed below.
- ☆19 Updated 6 months ago
- Streamlit app for recommending eval functions using prompt diffs ☆27 Updated last year
- Writing Blog Posts with Generative Feedback Loops! ☆47 Updated last year
- ☆41 Updated 4 months ago
- The official evaluation suite and dynamic data release for MixEval. ☆11 Updated 6 months ago
- AI_Powered_Dev_Search_Engine ☆12 Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 Updated 5 months ago
- ☆1 Updated 9 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 9 months ago
- Experimenting text-embeddings-inference server on both CPU and GPU ☆18 Updated last year
- ☆20 Updated last month
- Uses a Gradio interface to stream coding related responses from local and cloud based large language models. Pulls context from GitHub Re… ☆21 Updated last month
- Codebase accompanying the Summary of a Haystack paper. ☆77 Updated 6 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆19 Updated 4 months ago
- ☆53 Updated 4 months ago
- a simple create-llama template using llama-index v0.10 and integrated with Ollama ☆10 Updated 10 months ago
- A framework for simulating e-commerce data and interactions that can be used to build recommendation systems ☆10 Updated last year
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆21 Updated last week
- ☆45 Updated 6 months ago
- ☆40 Updated 2 months ago
- ☆39 Updated this week
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆39 Updated last year
- ☆10 Updated 6 months ago
- ☆48 Updated 5 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆76 Updated 5 months ago
- Creating Generative AI Apps which work ☆17 Updated 9 months ago
- "Syntriever: How to Train Your Retriever with Synthetic Data from LLMs" the Nations of the Americas Chapter of the Association for Comput… ☆24 Updated last month
- This repository contains the source code for running llamaindex tutorials from https://howaibuildthis.substack.com/ ☆40 Updated last year
- Aioli: A unified optimization framework for language model data mixing ☆23 Updated 2 months ago
- ☆16 Updated 6 months ago