rungalileo / hallucination-index
Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate.
☆110 · Updated 8 months ago
Alternatives and similar repositories for hallucination-index
Users who are interested in hallucination-index are comparing it to the libraries listed below.
- This repository implements the chain of verification paper by Meta AI ☆169 · Updated last year
- ARAGOG - Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆104 · Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 · Updated 10 months ago
- ☆89 · Updated last year
- ☆77 · Updated 11 months ago
- Repository to demonstrate Chain of Table reasoning with multiple tables powered by LangGraph ☆144 · Updated last year
- Testing speed and accuracy of RAG with and without a Cross Encoder Reranker. ☆48 · Updated last year
- Simple examples using Argilla tools to build AI ☆53 · Updated 6 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker ☆111 · Updated last week
- LangChain chat model abstractions for dynamic failover, load balancing, chaos engineering, and more! ☆81 · Updated last year
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆172 · Updated 8 months ago
- ☆143 · Updated 10 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆78 · Updated 8 months ago
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆99 · Updated last year
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆110 · Updated 8 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆105 · Updated last month
- A semantic research engine to get relevant papers based on a user query. Application frontend with Chainlit Copilot. Observability with L… ☆82 · Updated last year
- Writing Blog Posts with Generative Feedback Loops! ☆48 · Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆77 · Updated 7 months ago
- ☆38 · Updated 10 months ago
- ☆74 · Updated 4 months ago
- Sample notebooks and prompts for LLM evaluation ☆128 · Updated last week
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆81 · Updated 8 months ago
- GenAIOps on Kubernetes: A collection of reference architectures for running GenAI at scale on Kubernetes using OSS tooling ☆130 · Updated 7 months ago
- ☆72 · Updated 7 months ago
- Open-source RAG evaluation through users' feedback ☆185 · Updated last year
- Just a bunch of benchmark logs for different LLMs ☆119 · Updated 10 months ago
- Collection of recipes aiding Gen AI model development ☆109 · Updated last week
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆63 · Updated last year
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆78 · Updated 2 months ago