Efficient multi-prompt evaluation of LLMs
☆33 · Dec 6, 2024 · Updated last year
Alternatives and similar repositories for prompteval
Users who are interested in prompteval are comparing it to the libraries listed below.
- End-to-End Ontology Learning with Large Language Models, NeurIPS 2024. ☆54 · Nov 6, 2024 · Updated last year
- ☆11 · Sep 10, 2023 · Updated 2 years ago
- simulate linkstate algorithm for routing ☆10 · Nov 6, 2023 · Updated 2 years ago
- Evaluating LLMs with fewer examples ☆175 · Apr 12, 2024 · Updated 2 years ago
- Repository for the paper: Aligning LLMs to Ask Good Questions A Case Study in Clinical Reasoning ☆18 · Feb 21, 2025 · Updated last year
- Codebase for character-centric story understanding ☆14 · Jan 20, 2022 · Updated 4 years ago
- ScienceMeter: Tracking Scientific Knowledge Updates in Language Models ☆17 · Jun 28, 2025 · Updated 10 months ago
- This repository contains an implementation of the simple yet powerful state machine agentic algorithm. ☆22 · Sep 29, 2025 · Updated 7 months ago
- compiler project for compiler course (spring 99) in sbu university ☆13 · Nov 21, 2023 · Updated 2 years ago
- ☆11 · Mar 12, 2021 · Updated 5 years ago
- ☆16 · Jul 11, 2023 · Updated 2 years ago
- This is a list of Persian foods ☆13 · Oct 1, 2020 · Updated 5 years ago
- Converts brat standoff format to JSONL format ☆13 · Jan 29, 2022 · Updated 4 years ago
- Helm chart for tile38 ☆15 · Mar 30, 2026 · Updated last month
- https://openreview.net/forum?id=OC1o4_OI6Jw ☆13 · May 27, 2022 · Updated 3 years ago
- CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation ☆14 · Aug 19, 2025 · Updated 8 months ago
- An original implementation of the paper "CREPE: Open-Domain Question Answering with False Presuppositions" ☆16 · Nov 5, 2024 · Updated last year
- Generating graph structures from OWL ontologies ☆12 · Nov 21, 2017 · Updated 8 years ago
- VAE+GAN ☆10 · Apr 18, 2018 · Updated 8 years ago
- multi_gpu_infer: multi-GPU inference using multiprocessing or subprocessing ☆12 · Mar 24, 2020 · Updated 6 years ago
- Sharif-AI-Challenge2021 Client ☆11 · Aug 20, 2021 · Updated 4 years ago
- Code and Dataset for Learning to Solve Complex Tasks by Talking to Agents ☆24 · May 24, 2022 · Updated 3 years ago
- ☆23 · Mar 2, 2025 · Updated last year
- [NAACL 2024] Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K arXiv Papers https://arxiv.org/abs/2307.… ☆17 · Jan 27, 2024 · Updated 2 years ago
- Docker compose for starting local OpenML instances ☆11 · Jan 13, 2023 · Updated 3 years ago
- A fork of BlenderProc used in the GRADE framework to generate environments and export some additional information for processing. ☆10 · Mar 9, 2023 · Updated 3 years ago
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent." ☆112 · Nov 11, 2024 · Updated last year
- learn most important part of docker fast and easy ☆16 · May 5, 2020 · Updated 6 years ago
- TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms and their Relations ☆11 · May 24, 2017 · Updated 8 years ago
- ☆16 · Feb 8, 2019 · Updated 7 years ago
- PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions (NeurIPS 2025 D&B track, Spotlight) ☆30 · Apr 9, 2026 · Updated last month
- Clinical NLP concept extraction of ADEs in the 2018 n2c2 Adverse Drug Events and Medication Extraction (Track 2). Includes data preproce… ☆16 · Nov 21, 2020 · Updated 5 years ago
- Hypothesis strategies for various Pytorch structures (including tensors and modules). ☆13 · Updated this week
- The CSCS ReFrame test suite