felipemaiapolo / prompteval
Efficient multi-prompt evaluation of LLMs
☆19 Updated 5 months ago
Alternatives and similar repositories for prompteval
Users who are interested in prompteval are comparing it to the libraries listed below.
- 🌾 Universal, customizable and deployable fine-grained evaluation for text generation. ☆23 Updated last year
- ☆71 Updated 7 months ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆108 Updated last year
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆105 Updated 7 months ago
- Few-shot Learning with Auxiliary Data ☆27 Updated last year
- This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses". ☆29 Updated 8 months ago
- Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible. ☆42 Updated 2 months ago
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions ☆44 Updated 10 months ago
- Interpreting Language Models with Contrastive Explanations (EMNLP 2022 Best Paper Honorable Mention) ☆62 Updated 3 years ago
- Official Repository for Dataset Inference for LLMs ☆33 Updated 9 months ago
- Tree prompting: easy-to-use scikit-learn interface for improved prompting. ☆37 Updated last year
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆84 Updated 5 months ago
- [NAACL 2024 Findings] Evaluation suite for the systematic evaluation of instruction selection methods. ☆22 Updated last year
- Data and code for the preprint "In-Context Learning with Long-Context Models: An In-Depth Exploration" ☆35 Updated 8 months ago
- Code for paper Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding ☆68 Updated 10 months ago
- ☆14 Updated last year
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024) ☆47 Updated 3 months ago
- ☆28 Updated 10 months ago
- In-context Example Selection with Influences ☆15 Updated 2 years ago
- ☆28 Updated 2 months ago
- Token-level Reference-free Hallucination Detection ☆94 Updated last year
- Finding semantically meaningful and accurate prompts. ☆46 Updated last year
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025] ☆30 Updated 3 months ago
- Adding new tasks to T0 without catastrophic forgetting ☆33 Updated 2 years ago
- ☆29 Updated last year
- Evaluation of neuro-symbolic engines ☆35 Updated 9 months ago
- ✨ Resolving Knowledge Conflicts in Large Language Models, COLM 2024 ☆16 Updated 7 months ago
- Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval ☆45 Updated 6 months ago
- Using Explanations as a Tool for Advanced LLMs ☆60 Updated 8 months ago
- Code and Data for the NAACL 24 paper: MacGyver: Are Large Language Models Creative Problem Solvers? ☆28 Updated last year