IBM / eval-assist
EvalAssist is an open-source project that simplifies using large language models as evaluators (LLM-as-a-Judge) of the output of other large language models by supporting users in iteratively refining evaluation criteria in a web-based user experience.
☆92Updated 3 weeks ago
Alternatives and similar repositories for eval-assist
Users who are interested in eval-assist are comparing it to the libraries listed below:
- A framework for fine-tuning retrieval-augmented generation (RAG) systems.☆137Updated this week
- LangFair is a Python library for conducting use-case level LLM bias and fairness assessments☆243Updated this week
- A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use…☆162Updated 2 weeks ago
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da☆119Updated 8 months ago
- A small library of LLM judges☆308Updated 4 months ago
- ☆148Updated last year
- This repository stems from our paper, “Cataloguing LLM Evaluations”, and serves as a living, collaborative catalogue of LLM evaluation fr…☆18Updated 2 years ago
- Granite Snack Cookbook -- easily consumable recipes (python notebooks) that showcase the capabilities of the Granite models☆329Updated last week
- ☆103Updated 8 months ago
- This repo is the central repo for all the RAG Evaluation reference material and partner workshop☆77Updated 7 months ago
- ARAGOG: Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper…☆114Updated last year
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data …☆212Updated this week
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task…☆180Updated last year
- Chunk your text more accurately using gpt4o-mini☆44Updated last year
- CUGA is an open-source generalist agent for the enterprise, supporting complex task execution on web and APIs, OpenAPI/MCP integrations, …☆388Updated this week
- Materials for the Ultimate Hybrid Search Workshop☆44Updated last year
- SynthGenAI - Package for Generating Synthetic Datasets using LLMs.☆54Updated 3 weeks ago
- A practical RAG where you can download and chat with a GitHub repo☆94Updated 10 months ago
- ☆38Updated last year
- An agentic AI application that allows you to chat with your papers and also gather information from papers on ArXiv and PubMed☆154Updated 7 months ago
- Build Enterprise RAG (Retrieval Augmented Generation) pipelines to tackle various Generative AI use cases with LLMs by simply plugging co…☆116Updated last year
- Generalist and Lightweight Model for Text Classification☆167Updated 2 weeks ago
- 🤗 Benchmark Large Language Models Reliably On Your Data☆419Updated this week
- A blueprint for AI development, focusing on applied examples of RAG, information extraction, analysis and fine-tuning in the age of LLMs …☆61Updated 10 months ago
- This project bootstraps and scaffolds projects for specific semantic search and RAG applications along with regular boilerplate c…☆91Updated last year
- Official Implementation of "Affordable AI Assistants with Knowledge Graph of Thoughts"☆200Updated last week
- A Lightweight Library for AI Observability☆252Updated 10 months ago
- Semantic Chunker is a lightweight Python package for semantically-aware chunking and clustering of text.☆285Updated 8 months ago
- Sample notebooks and prompts for LLM evaluation☆156Updated last month
- A curated list of materials on AI guardrails☆43Updated 6 months ago