Efficient multi-prompt evaluation of LLMs
☆33 · Dec 6, 2024 · Updated last year
Alternatives and similar repositories for prompteval
Users that are interested in prompteval are comparing it to the libraries listed below.
- The collection of related papers and resources for the paper Time Series Analysis for Education: Methods, Applications, and Future Direct… ☆20 · Apr 12, 2025 · Updated last year
- The official GitHub page for paper "NegativePrompt: Leveraging Psychology for Large Language Models Enhancement via Negative Emotional St… ☆25 · May 10, 2024 · Updated 2 years ago
- Prolog implemented in Python ☆12 · Sep 6, 2024 · Updated last year
- Evaluating LLMs with fewer examples ☆179 · Apr 12, 2024 · Updated 2 years ago
- 🗜️ Codebase of the ACIP algorithm 🗜️ ☆18 · Feb 11, 2026 · Updated 4 months ago
- PyTorch package to train and audit ML models for Individual Fairness ☆68 · Sep 17, 2025 · Updated 9 months ago
- ☆11 · May 18, 2025 · Updated last year
- ☆10 · Nov 15, 2023 · Updated 2 years ago
- Codebase for character-centric story understanding ☆14 · Jan 20, 2022 · Updated 4 years ago
- Conceptual Construct Representations ☆11 · Feb 23, 2023 · Updated 3 years ago
- [COLM '25] Single-Pass Document Scanning for Question Answering ☆14 · Aug 20, 2025 · Updated 10 months ago
- Compiler project for a compiler course (spring 99) at SBU university ☆13 · Nov 21, 2023 · Updated 2 years ago
- ☆11 · Mar 12, 2021 · Updated 5 years ago
- USTC Spring 2022 "Introduction to Deep Learning" course resources ☆10 · Aug 7, 2022 · Updated 3 years ago
- Resolving Knowledge Conflicts in Large Language Models, COLM 2024 ☆18 · Oct 7, 2025 · Updated 8 months ago
- CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation ☆14 · Aug 19, 2025 · Updated 10 months ago
- Generating graph structures from OWL ontologies ☆12 · Nov 21, 2017 · Updated 8 years ago
- Privateer is a plugin-based framework for security & compliance evaluations. ☆21 · Jun 26, 2026 · Updated last week
- ☆23 · Mar 2, 2025 · Updated last year
- This is the source code for "Dream On". An indie game planned to be released in Fall 2021. ☆10 · Aug 19, 2021 · Updated 4 years ago
- Reliable Source Approximation: Source-Free Domain Adaptation for Vestibular Schwannoma MRI Segmentation ☆11 · Dec 28, 2024 · Updated last year
- learn most important part of docker fast and easy ☆16 · May 5, 2020 · Updated 6 years ago
- Wide Kernel Time-Frequency Fusion (WTFF)--Multi-Domain Time-Frequency Fusion Feature Contrastive Learning for Machinery Fault Diagnosis ☆14 · Mar 21, 2025 · Updated last year
- ☆16 · Feb 8, 2019 · Updated 7 years ago
- seanlau-flair / unsupervised-remaining-useful-life-prediction-for-bearings-with-virtual-health-index ☆10 · Dec 8, 2022 · Updated 3 years ago
- Code and data for the paper: DTSM: Toward Dense Table Structure Recognition with Text Query Encoder and Adjacent Feature Aggregator ☆13 · Apr 28, 2024 · Updated 2 years ago
- Official code of "The Automated but Risky Game: Modeling and Benchmarking Agent-to-Agent Negotiations and Transactions in Consumer Market… ☆27 · Jun 9, 2026 · Updated 3 weeks ago
- The CSCS ReFrame test suite ☆15 · Jun 28, 2026 · Updated last week
- r4c ☆14 · Mar 2, 2021 · Updated 5 years ago
- A Digital Twin prototype for aircraft engine health management in order to identify possible faults and to predict its remaining useful l… ☆16 · Feb 9, 2025 · Updated last year
- Biomedical Relation Extraction for Transcription Factor and Gene / Gene Products (part of a Master Thesis at Rostlab, TUM) ☆12 · Dec 23, 2017 · Updated 8 years ago
- Code for verifying deep neural feature ansatz ☆22 · May 3, 2023 · Updated 3 years ago
- PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions (NeurIPS 2025 D&B track, Spotlight) ☆36 · Apr 9, 2026 · Updated 2 months ago
- This repository contains all the source code needed to reproduce the experiments or review the results obtained in the research paper "… ☆13 · Dec 9, 2023 · Updated 2 years ago
- Medication Extraction and Reconciliation Knowledge Instrument ☆13 · Jun 17, 2026 · Updated 2 weeks ago
- The official repo for DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph ☆18 · Oct 13, 2024 · Updated last year
- Source code for the paper "Evaluating calibration of deep fault diagnostic models under distribution shift" published in Journal Computer… ☆24 · Jul 3, 2025 · Updated last year
- Belief in the Machine: Investigating Epistemological Blind Spots of Language Models ☆35 · Apr 19, 2025 · Updated last year
- ☆16 · Feb 2, 2025 · Updated last year