felipemaiapolo / prompteval
Efficient multi-prompt evaluation of LLMs
☆26Updated last year
Alternatives and similar repositories for prompteval
Users who are interested in prompteval are comparing it to the libraries listed below.
- Discovering Data-driven Hypotheses in the Wild☆124Updated 6 months ago
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding.☆47Updated 9 months ago
- Optimize Any User-defined Compound AI Systems☆65Updated 4 months ago
- Code for Language-Interfaced FineTuning for Non-Language Machine Learning Tasks.☆133Updated last year
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales"☆112Updated last year
- ☆28Updated 8 months ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models"☆107Updated 2 years ago
- Codebase accompanying the Summary of a Haystack paper.☆80Updated last year
- ☆94Updated 10 months ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search☆102Updated last year
- Official implementation of the ACL 2024: Scientific Inspiration Machines Optimized for Novelty☆92Updated last year
- Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"☆71Updated last year
- Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"☆19Updated 6 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization"☆91Updated last year
- LangCode - Improving alignment and reasoning of large language models (LLMs) with natural language embedded program (NLEP).☆48Updated 2 years ago
- Evaluating LLMs with fewer examples☆170Updated last year
- ☆43Updated last year
- ReBase: Training Task Experts through Retrieval Based Distillation☆29Updated 10 months ago
- A toolkit to induce interpretable workflows from raw computer-use activities.☆35Updated last month
- [EMNLP'24] EHRAgent: Code Empowers Large Language Models for Complex Tabular Reasoning on Electronic Health Records☆118Updated last year
- Leveraging Base Language Models for Few-Shot Synthetic Data Generation☆40Updated 2 months ago
- ☆90Updated last week
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".☆222Updated 2 weeks ago
- ScienceMeter: Tracking Scientific Knowledge Updates in Language Models☆17Updated 6 months ago
- Official implementation of "BERTs are Generative In-Context Learners"☆32Updated 9 months ago
- [ACL 2024] <Large Language Models for Automated Open-domain Scientific Hypotheses Discovery>. It has also received the best poster award …☆42Updated last year
- ☆28Updated 10 months ago
- Aioli: A unified optimization framework for language model data mixing☆31Updated 11 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators☆43Updated last year
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models…☆38Updated last year