prometheus-eval / prometheus-eval
Evaluate your LLM's response with Prometheus and GPT4 💯
☆877, updated last month
Alternatives and similar repositories for prometheus-eval:
Users who are interested in prometheus-eval are comparing it to the libraries listed below:
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends (☆1,238, updated this week)
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… (☆2,517, updated this week)
- (☆498, updated 3 months ago)
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. (☆1,314, updated last week)
- Automatically evaluate your LLMs in Google Colab (☆593, updated 9 months ago)
- Automated Evaluation of RAG Systems (☆554, updated 3 months ago)
- Official repository for ORPO (☆441, updated 9 months ago)
- Official repository for ICLR 2025 paper "Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing". Your efficient an… (☆638, updated 2 weeks ago)
- Generative Representational Instruction Tuning (☆599, updated last month)
- Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas" (☆1,048, updated last week)
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. (☆973, updated last month)
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models (☆497, updated 8 months ago)
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. (☆2,261, updated 2 weeks ago)
- A lightweight library for generating synthetic instruction tuning datasets for your data without GPT. (☆743, updated this week)
- Stanford NLP Python library for Representation Finetuning (ReFT) (☆1,431, updated 3 weeks ago)
- Chat Templates for HuggingFace Large Language Models (☆619, updated 2 months ago)
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs). (☆808, updated this week)
- An Open Source Toolkit For LLM Distillation (☆516, updated last month)
- Doing simple retrieval from LLM models at various context lengths to measure accuracy (☆1,730, updated 6 months ago)
- A library for easily merging multiple LLM experts, and efficiently train the merged LLM. (☆445, updated 6 months ago)
- Best practices for distilling large language models. (☆491, updated last year)
- Code for Quiet-STaR (☆716, updated 6 months ago)
- (☆1,006, updated 2 months ago)
- A reading list on LLM based Synthetic Data Generation 🔥 (☆1,176, updated last week)
- Train Models Contrastively in Pytorch (☆652, updated last week)
- [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning (☆349, updated 5 months ago)
- (☆815, updated 5 months ago)
- TextGrad: Automatic ''Differentiation'' via Text -- using large language models to backpropagate textual gradients. (☆2,110, updated this week)
- Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard a… (☆1,049, updated last month)
- (☆366, updated last month)