Efficient multi-prompt evaluation of LLMs
☆31 · Dec 6, 2024 · Updated last year
Alternatives and similar repositories for prompteval
Users that are interested in prompteval are comparing it to the libraries listed below.
- End-to-End Ontology Learning with Large Language Models, NeurIPS 2024. ☆51 · Nov 6, 2024 · Updated last year
- Prolog implemented in Python ☆12 · Sep 6, 2024 · Updated last year
- ☆11 · Sep 10, 2023 · Updated 2 years ago
- ☆12 · Nov 2, 2021 · Updated 4 years ago
- simulate linkstate algorithm for routing ☆10 · Nov 6, 2023 · Updated 2 years ago
- Evaluating LLMs with fewer examples ☆173 · Apr 12, 2024 · Updated 2 years ago
- Repository for the paper: Aligning LLMs to Ask Good Questions A Case Study in Clinical Reasoning ☆18 · Feb 21, 2025 · Updated last year
- 🗜️Codebase of the ACIP algorithm 🗜️ ☆18 · Feb 11, 2026 · Updated 2 months ago
- ☆11 · May 18, 2025 · Updated 11 months ago
- ☆10 · Nov 15, 2023 · Updated 2 years ago
- [COLM '25] Single-Pass Document Scanning for Question Answering ☆13 · Aug 20, 2025 · Updated 7 months ago
- This repository contains an implementation of the simple yet powerful state machine agentic algorithm. ☆22 · Sep 29, 2025 · Updated 6 months ago
- Use shecan in bash with ease ☆15 · Feb 8, 2019 · Updated 7 years ago
- ☆16 · Jul 11, 2023 · Updated 2 years ago
- An original implementation of the paper "CREPE: Open-Domain Question Answering with False Presuppositions" ☆16 · Nov 5, 2024 · Updated last year
- Generating graph structures from OWL ontologies ☆12 · Nov 21, 2017 · Updated 8 years ago
- Privateer is a plugin-based framework for security & compliance evaluations. ☆19 · Updated this week
- Sharif-AI-Challenge2021 Client ☆11 · Aug 20, 2021 · Updated 4 years ago
- PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions (NeurIPS 2025 D&B track, Spotlight) ☆26 · Apr 9, 2026 · Updated last week
- ☆22 · Mar 2, 2025 · Updated last year
- [NAACL 2024] Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K arXiv Papers https://arxiv.org/abs/2307.… ☆17 · Jan 27, 2024 · Updated 2 years ago
- This is the source code for "Dream On". An indie game planned to be released in Fall 2021. ☆10 · Aug 19, 2021 · Updated 4 years ago
- learn most important part of docker fast and easy ☆16 · May 5, 2020 · Updated 5 years ago
- TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms and their Relations ☆11 · May 24, 2017 · Updated 8 years ago
- Wide Kernel Time-Frequency Fusion (WTFF)--Multi-Domain Time-Frequency Fusion Feature Contrastive Learning for Machinery Fault Diagnosis ☆14 · Mar 21, 2025 · Updated last year
- ☆16 · Feb 8, 2019 · Updated 7 years ago
- seanlau-flair / unsupervised-remaining-useful-life-prediction-for-bearings-with-virtual-health-index ☆10 · Dec 8, 2022 · Updated 3 years ago
- Notebooks for 6.S088 IAP 2023 ☆16 · Aug 1, 2024 · Updated last year
- A Digital Twin prototype for aircraft engine health management in order to identify possible faults and to predict its remaining useful l… ☆13 · Feb 9, 2025 · Updated last year
- Website for visualizing predicted drug side-effects using L1000 data (http://maayanlab.net/SEP-L1000/) ☆10 · Apr 15, 2022 · Updated 4 years ago
- Belief in the Machine: Investigating Epistemological Blind Spots of Language Models ☆32 · Apr 19, 2025 · Updated last year
- Biomedical Relation Extraction for Transcription Factor and Gene / Gene Products (part of a Master Thesis at Rostlab, TUM) ☆12 · Dec 23, 2017 · Updated 8 years ago
- Code for verifying deep neural feature ansatz ☆22 · May 3, 2023 · Updated 2 years ago
- This repository contains all the source code needed to reproduce the experiments or review the results obtained in the research paper "… ☆13 · Dec 9, 2023 · Updated 2 years ago
- The official repo for DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph ☆18 · Oct 13, 2024 · Updated last year
- ☆15 · Apr 8, 2026 · Updated last week
- Source code for the paper "Evaluating calibration of deep fault diagnostic models under distribution shift" published in Journal Computer… ☆23 · Jul 3, 2025 · Updated 9 months ago
- This repository is a PyTorch implementation for NIPS 2024 Paper "Reinforced Cross-Domain Knowledge Distillation on Time Series Data". ☆16 · Sep 26, 2024 · Updated last year
- This is official implementation of "Curriculum Fine-tuning of Vision Foundation Model for Medical Image Classification Under Label Noise"… ☆21 · Mar 18, 2025 · Updated last year