felipemaiapolo / prompteval
Efficient multi-prompt evaluation of LLMs
☆19Updated 2 months ago
Alternatives and similar repositories for prompteval:
Users who are interested in prompteval are comparing it to the libraries listed below.
- Code for Language-Interfaced FineTuning for Non-Language Machine Learning Tasks.☆122Updated 3 months ago
- Official repo for SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency☆35Updated last month
- [ACL 2024 Findings] This is the code for our paper "Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation wi…☆38Updated 7 months ago
- A collection of AWESOME language modeling techniques on tabular data applications.☆28Updated 4 months ago
- This repository contains data, code and models for contextual noncompliance.☆20Updated 7 months ago
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024)☆43Updated last month
- This is the official implementation of TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data☆10Updated 7 months ago
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales"☆98Updated 4 months ago
- [EMNLP'24] EHRAgent: Code Empowers Large Language Models for Complex Tabular Reasoning on Electronic Health Records☆78Updated last month
- This is the official implementation for our ACL 2024 paper: "Causal Estimation of Memorisation Profiles".☆19Updated 4 months ago
- AbstainQA, ACL 2024☆25Updated 4 months ago
- Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"☆16Updated 7 months ago
- ☆24Updated 3 months ago
- Conformal Language Modeling☆28Updated last year
- PASTA: Post-hoc Attention Steering for LLMs☆112Updated 2 months ago
- Implementation of PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024)☆32Updated 3 months ago
- The official repo for DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph☆14Updated 4 months ago
- ConceptVectors Benchmark and Code for the paper "Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces"☆32Updated last week
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models"☆108Updated last year
- Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible.☆39Updated 10 months ago
- ☆47Updated last year
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding.☆32Updated 2 months ago
- This is the implementation for the paper "LARGE LANGUAGE MODEL CASCADES WITH MIXTURE OF THOUGHT REPRESENTATIONS FOR COST-EFFICIENT REA…☆19Updated 8 months ago
- Using Explanations as a Tool for Advanced LLMs☆58Updated 5 months ago
- Interpreting Language Models with Contrastive Explanations (EMNLP 2022 Best Paper Honorable Mention)☆62Updated 2 years ago
- [ACL 2023]: Training Trajectories of Language Models Across Scales https://arxiv.org/pdf/2212.09803.pdf☆22Updated last year
- ☆85Updated last week
- Token-level Reference-free Hallucination Detection☆94Updated last year
- Data and code for the Corr2Cause paper (ICLR 2024)☆93Updated 10 months ago
- Extending Conformal Prediction to LLMs☆63Updated 8 months ago