deshwalmahesh / PHUDGE

Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without a custom rubric or reference answer, in absolute or relative mode, and much more. It also contains a list of the available tools, methods, repos, and code for hallucination detection, LLM evaluation, grading, and more.

☆49

Alternatives and similar repositories for PHUDGE

Users that are interested in PHUDGE are comparing it to the libraries listed below

salesforce / summary-of-a-haystack
Codebase accompanying the Summary of a Haystack paper.
☆79 Updated 10 months ago
davanstrien / data-for-fine-tuning-llms
☆77 Updated last year
louisbrulenaudet / ragoon
High-level library for batched embeddings generation, blazingly fast web-based RAG and quantized indexes processing ⚡
☆66 Updated 8 months ago
apple / ml-superposition-prompting
☆145 Updated last year
weaviate / structured-rag
Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models
☆111 Updated 3 months ago
flowaicom / flow-judge
Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…
☆76 Updated 9 months ago
automix-llm / automix
Mixing Language Models with Self-Verification and Meta-Verification
☆105 Updated 7 months ago
pacman100 / peft-codegen-25
☆23 Updated 2 years ago
AnswerDotAI / ModernBERT-Instruct-mini-cookbook
☆49 Updated 5 months ago
padas-lab-de / ir-rag-sigir24-persona-rag
☆47 Updated 10 months ago
davanstrien / haiku-dpo
Using open source LLMs to build synthetic datasets for direct preference optimization
☆65 Updated last year
arcee-ai / DAM
☆53 Updated 8 months ago
zetaalphavector / RAGElo
RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker
☆114 Updated 3 weeks ago
mickymultani / RAG-with-Cross-Encoder-Reranker
Testing speed and accuracy of RAG with and without a Cross Encoder Reranker.
☆48 Updated last year
TIGER-AI-Lab / StructLM
Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)
☆75 Updated 9 months ago
PrithivirajDamodaran / Route0x
Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da
☆108 Updated 4 months ago
Upaya07 / NeurIPS-llm-efficiency-challenge
Code for NeurIPS LLM Efficiency Challenge
☆59 Updated last year
IlyasMoutawwakil / py-txi
A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
☆33 Updated 2 months ago
geronimi73 / phi2-finetune
☆87 Updated last year
S1M0N38 / dspy-arxiv
Explore the use of DSPy for extracting features from PDFs 🔎
☆45 Updated last year
daniel-furman / sft-demos
Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.
☆77 Updated 9 months ago
weaviate-tutorials / Hurricane
Writing Blog Posts with Generative Feedback Loops!
☆50 Updated last year
Hannibal046 / nanoColBERT
Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832).
☆80 Updated last year
hyintell / RetrievalQA
Source code of the paper: RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering [F…
☆66 Updated last year
writer / writing-in-the-margins
☆118 Updated 11 months ago
argilla-io / argilla-cookbook
Simple examples using Argilla tools to build AI
☆53 Updated 8 months ago
davidberenstein1957 / dataset-viber
Dataset Viber is your chill repo for data collection, annotation and vibe checks.
☆47 Updated 10 months ago
PrithivirajDamodaran / blitz-embed
C++ inference wrappers for running blazing fast embedding services on your favourite serverless like AWS Lambda. By Prithivi Da, PRs welc…
☆22 Updated last year
orionw / promptriever
The first dense retrieval model that can be prompted like an LM
☆81 Updated 2 months ago
mrmps / ai-chunker
Chunk your text using gpt4o-mini more accurately
☆44 Updated 11 months ago