zhuohaoyu/KIEval

[ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models

☆40

Alternatives and similar repositories for KIEval

Users who are interested in KIEval are comparing it to the libraries listed below.

TOWESSL / TOWESSL
Exploiting Unlabeled Data for Target-Oriented Opinion Words Extraction
☆24 · Sep 30, 2022 · Updated 3 years ago
WisdomShell / ujb
☆17 · Feb 28, 2024 · Updated 2 years ago
jiaxiaojunQAQ / FGSM-PGK
Improving fast adversarial training with prior-guided knowledge (TPAMI 2024)
☆43 · Apr 21, 2024 · Updated 2 years ago
GlitchBench / Benchmark
Code and Data for GlitchBench
☆13 · Feb 27, 2024 · Updated 2 years ago
tangzhy / RealCritic
☆15 · Jan 27, 2025 · Updated last year
TrustJudge / TrustJudge
🎉 TrustJudge is accepted to ICLR 2026!
☆49 · Sep 27, 2025 · Updated 10 months ago
WeOpenML / PandaLM
☆926 · May 22, 2024 · Updated 2 years ago
chentong0 / copy-bench
CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation
☆14 · Aug 19, 2025 · Updated 11 months ago
jiaxiaojunQAQ / I-GCG
Improved techniques for optimization-based jailbreaking on large language models (ICLR 2025)
☆146 · Apr 7, 2025 · Updated last year
sarrouti / HealthVer
☆20 · Feb 3, 2022 · Updated 4 years ago
wwangwitsel / PLDA
[KDD'22] Partial Label Learning with Discrimination Augmentation
☆10 · May 21, 2024 · Updated 2 years ago
liujch1998 / memo-trap
☆23 · Jan 25, 2023 · Updated 3 years ago
tlringer / proof-chat-fun
Playing with GPT-4
☆13 · Mar 17, 2023 · Updated 3 years ago
zhu-minjun / SafetyLock
Your finetuned model's back to its original safety standards faster than you can say "SafetyLock"!
☆11 · Oct 16, 2024 · Updated last year
lifan-yuan / FactMix
Code for COLING 2022 paper "FactMix: Using a Few Labeled In-domain Examples to Generalize to Cross-domain Named Entity Recognition"
☆15 · Jan 15, 2023 · Updated 3 years ago
lyh6560new / P3Sum
The official code for the paper "What Constitutes a Faithful Summary? Preserving Author Perspectives in News Summarization"
☆10 · Jun 23, 2024 · Updated 2 years ago
sail-sg / dice
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
☆47 · Apr 15, 2025 · Updated last year
YangLinyi / GLUE-X
We leverage 14 datasets as OOD test data and conduct evaluations on 8 NLU tasks over 21 popularly used models. Our findings confirm that …
☆100 · Aug 15, 2023 · Updated 2 years ago
WisdomShell / ADG
[ACL'26 Main Conference] Instruction Data Selection via Answer Divergence
☆22 · Apr 14, 2026 · Updated 3 months ago
maitrix-org / de-arena
Official repository for Decentralized Arena via Collective LLM Intelligence
☆18 · May 19, 2025 · Updated last year
LetterLiGo / Inaudible-Adversarial-Perturbation-Vrifle
[NDSS'24] Inaudible Adversarial Perturbation: Manipulating the Recognition of User Speech in Real Time
☆56 · Sep 28, 2024 · Updated last year
stellalisy / mediQ
☆44 · Jan 26, 2025 · Updated last year
vidhishanair / FactEdit
☆14 · Aug 30, 2023 · Updated 2 years ago
jaehunjung1 / Maieutic-Prompting
☆52 · Oct 24, 2023 · Updated 2 years ago
YichenZW / Robust-Det
The code implementation of the paper Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks (A…
☆13 · Jul 16, 2024 · Updated 2 years ago
BunsenFeng / FactKB
Code for "FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge". EMNLP 2023.
☆20 · Dec 25, 2023 · Updated 2 years ago
iai-group / UserSimCRS
Conversational Recommender System Evaluation via Simulation
☆22 · Jul 21, 2026 · Updated last week
multimodal-art-projection / I-SHEEP
I-SHEEP: Iterative Self-enHancEmEnt Paradigm of LLMs through Self-Instruct and Self-Assessment
☆17 · Jan 16, 2025 · Updated last year
yikee / Knowledge_Conflict
Resolving Knowledge Conflicts in Large Language Models, COLM 2024
☆18 · Oct 7, 2025 · Updated 9 months ago
luisf-gomez / Explorer-FE-AU-in-PD
This GitHub repository provides the source code for the paper "Exploring Facial Expression and Action Units in Parkinson Disease"
☆10 · Dec 21, 2022 · Updated 3 years ago
waltonfuture / Diff-eRank
[NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models
☆59 · May 28, 2025 · Updated last year
hrwise-nlp / Cue-CoT
Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs [EMNLP 2023 Findings]
☆24 · Nov 18, 2023 · Updated 2 years ago
yikee / ScienceMeter
ScienceMeter: Tracking Scientific Knowledge Updates in Language Models, COLM 2026
☆17 · Jun 28, 2025 · Updated last year
WisdomShell / ETC
[AAAI'26, Oral] Modeling Uncertainty Trends for Timely Retrieval in Dynamic RAG
☆32 · Apr 14, 2026 · Updated 3 months ago
acl-org / acl-2024
Repository for the ACL 2024 conference website
☆18 · Feb 3, 2025 · Updated last year
PROPHETE-pro / MaterialSeg3D_
☆10 · Oct 22, 2024 · Updated last year
CMMMU-Benchmark / CMMMU
☆48 · Sep 5, 2024 · Updated last year
StevenZHB / CoT_Causal_Analysis
Repository of paper "How Likely Do LLMs with CoT Mimic Human Reasoning?"
☆23 · Feb 19, 2025 · Updated last year
xhan77 / jpeg-lm
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations
☆16 · Sep 29, 2024 · Updated last year