SLAB-NLP/Multi-Prompt-LLM-Evaluation

State of What Art? A Call for Multi-Prompt LLM Evaluation

☆16

Alternatives and similar repositories for Multi-Prompt-LLM-Evaluation

Users interested in Multi-Prompt-LLM-Evaluation are comparing it to the repositories listed below.

eliyahabba / PromptSuite
☆16 · Nov 24, 2025 · Updated 8 months ago
edahanoam / Awesome-Summarization-Datasets
Updating collection of summarization datasets in 100+ languages, based on our paper "The State and Fate of Summarization Datasets: A Surv…
☆31 · Apr 29, 2025 · Updated last year
yrf1 / LLM-MassiveMulticultureNormsKnowledge-NCLB
☆20 · Mar 12, 2025 · Updated last year
orensul / analogies_mining
☆21 · Mar 19, 2024 · Updated 2 years ago
YichenZW / awesome-llm-diversity
A curated collection of research papers exploring diversity in Large Language Model text generation. This repository tracks cutting-edge …
☆15 · Jun 19, 2026 · Updated last month
assafbk / OPRM
Overflow Prevention Enhances Long-Context Recurrent LLMs (COLM 2025)
☆18 · Jul 8, 2025 · Updated last year
siddheshih / culture-awareness-llms
☆20 · Nov 7, 2024 · Updated last year
schwartz-lab-NLP / Tokens2Words
☆16 · Apr 2, 2025 · Updated last year
Itaymanes / K-QA
Dataset and Evaluation Code for the K-QA Benchmark.
☆18 · May 26, 2024 · Updated 2 years ago
msra-nlc / MSParS_V2.0
☆25 · Feb 24, 2020 · Updated 6 years ago
mickymultani / nvidia-NIM-RAG
Project demonstrates the power and simplicity of NVIDIA NIM (NVIDIA Inference Model), a suite of optimized cloud-native microservices, by…
☆16 · Mar 21, 2024 · Updated 2 years ago
niveck / LLMafia
Asynchronous LLM Agent playing games of Mafia against human players
☆23 · Nov 12, 2025 · Updated 8 months ago
jiyounglee-0523 / VisAlign
☆20 · Apr 23, 2024 · Updated 2 years ago
biasinrecsys / wsdm2021
WSDM 2021 Tutorial on Advances in Bias-aware Recommendation on the Web
☆11 · Mar 8, 2021 · Updated 5 years ago
levymsn / CQA-CRCT
Official PyTorch implementation for "Classification-Regression for Chart Comprehension"
☆26 · Feb 5, 2025 · Updated last year
dimalik / prediction_error
Neural embeddings with negative sampling in Keras
☆11 · Jun 11, 2017 · Updated 9 years ago
j-min / IterInpaint
Code for IterInpaint model, presented in Diagnostic Benchmark and Iterative Inpainting for Layout-Guided Image Generation (CVPR 2024 work…
☆25 · Jul 21, 2024 · Updated 2 years ago
aiovine / converse-dataset
Natural language dataset for training a Conversational Recommender System
☆11 · Jul 9, 2019 · Updated 7 years ago
eliorsulem / simplification-acl2018
Human Evaluation Benchmark for Text Simplification
☆10 · Sep 6, 2018 · Updated 7 years ago
leroy9472 / InMind
☆15 · Nov 18, 2025 · Updated 8 months ago
gmftbyGMFTBY / MomentumDecoding
Momentum Decoding: Open-ended Text Generation as Graph Exploration
☆19 · Jan 27, 2023 · Updated 3 years ago
bbc / dsrp_bbcavs10k_distribution
Repo for the BBCAVS10k distribution
☆10 · Nov 27, 2024 · Updated last year
HAE-RAE / HAERAE-VISION
Evaluation code for HAERAE-Vision benchmark
☆15 · Apr 29, 2026 · Updated 2 months ago
timjogorman / Multisentence-AMR-guidelines
Guidelines for our secondary layer of annotation adding multi-sentence AMR links
☆12 · Sep 6, 2017 · Updated 8 years ago
Victorwz / VaLM
VaLM: Visually-augmented Language Modeling. ICLR 2023.
☆56 · Mar 6, 2023 · Updated 3 years ago
jfilliben / poker-sim
Python implementation of a Texas Hold'em Monte Carlo Simulator
☆19 · Dec 30, 2015 · Updated 10 years ago
DCSaunders / gender-debias
Adaptation datasets and scripts for the paper "Reducing gender bias in Neural Machine Translation as a domain adaptation problem" (ACL 20…
☆13 · Mar 18, 2021 · Updated 5 years ago
gregdeon / spotlight
Implementation of the spotlight: a method for discovering systematic errors in deep learning models
☆11 · Oct 5, 2021 · Updated 4 years ago
deep-diver / janus
generate synthetic data for LLM fine-tuning in arbitrary situations within systematic way
☆22 · Mar 18, 2024 · Updated 2 years ago
gucci-j / light-transformer-emnlp2021
EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling
☆34 · Nov 21, 2021 · Updated 4 years ago
quadrismegistus / lltk
Literary Language Toolkit: code, models, corpora, and web tools
☆11 · Jul 5, 2026 · Updated 2 weeks ago
biasinrecsys / umap2020
ACM UMAP2020 Hands-on Tutorial on Data and Algorithmic Bias in Recommender Systems
☆10 · May 23, 2021 · Updated 5 years ago
MohammadHeydari / Persian_FastText
Persian Word Embedding Using FastText Pre-trained Model
☆13 · May 29, 2026 · Updated last month
ssu-humane / K-HATERS
Hate speech detection corpus in Korean, shared with EMNLP 2023 paper
☆17 · Apr 19, 2024 · Updated 2 years ago
allenai / sledgehammer
☆48 · Jun 8, 2020 · Updated 6 years ago
Kaleidophon / awesome-experimental-standards-deep-learning
Repository collecting resources and best practices to improve experimental rigour in deep learning research.
☆27 · Mar 30, 2023 · Updated 3 years ago
asappresearch / interactive-classification
☆15 · Feb 24, 2021 · Updated 5 years ago
julianmichael / qasrl
Tools for working with QA-SRL data and annotating it with crowdsourcing.
☆13 · Sep 22, 2023 · Updated 2 years ago
sEhsanTaher / Beheshti-NER
Beheshti-NER: Persian named entity recognition Using BERT
☆14 · May 16, 2021 · Updated 5 years ago