[ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models
☆40Jul 19, 2024Updated last year
Alternatives and similar repositories for KIEval
Users that are interested in KIEval are comparing it to the libraries listed below.
- ☆19Aug 3, 2024Updated last year
- ☆19May 25, 2024Updated 2 years ago
- Exploiting Unlabeled Data for Target-Oriented Opinion Words Extraction☆24Sep 30, 2022Updated 3 years ago
- ☆17Feb 28, 2024Updated 2 years ago
- Improving fast adversarial training with prior-guided knowledge (TPAMI2024)☆43Apr 21, 2024Updated 2 years ago
- Code and Data for GlitchBench☆13Feb 27, 2024Updated 2 years ago
- The repository for paper <Evaluating Open-QA Evaluation>☆25Apr 9, 2024Updated 2 years ago
- Code for Semantic-Aligned Adversarial Evolution Triangle for High-Transferability Vision-Language Attack (TPAMI 2025)☆42Aug 28, 2025Updated 9 months ago
- ☆924May 22, 2024Updated 2 years ago
- Improved techniques for optimization-based jailbreaking on large language models (ICLR2025)☆145Apr 7, 2025Updated last year
- ☆32Jun 12, 2024Updated 2 years ago
- [KDD'22] Partial Label Learning with Discrimination Augmentation☆10May 21, 2024Updated 2 years ago
- ☆23Jan 25, 2023Updated 3 years ago
- 🎉 TrustJudge is accepted to ICLR 2026!☆47Sep 27, 2025Updated 8 months ago
- ☆32Jul 11, 2024Updated last year
- Conversational Recommender System Evaluation via Simulation☆20Updated this week
- Your finetuned model's back to its original safety standards faster than you can say "SafetyLock"!☆11Oct 16, 2024Updated last year
- Code for COLING 2022 paper "FactMix: Using a Few Labeled In-domain Examples to Generalize to Cross-domain Named Entity Recognition"☆15Jan 15, 2023Updated 3 years ago
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards☆47Apr 15, 2025Updated last year
- Official repository of "Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions", ICLR 2024 Sp…☆21Mar 7, 2024Updated 2 years ago
- ☆12Jan 20, 2025Updated last year
- ScienceMeter: Tracking Scientific Knowledge Updates in Language Models☆17Jun 28, 2025Updated 11 months ago
- Official repository for Decentralized Arena via Collective LLM Intelligence☆18May 19, 2025Updated last year
- ☆51Oct 24, 2023Updated 2 years ago
- ☆41Jan 26, 2025Updated last year
- Detect and defend against the nonce race exploit on Polymarket's CTF Exchange☆62Mar 17, 2026Updated 3 months ago
- ☆16Dec 14, 2023Updated 2 years ago
- The code implementation of the paper Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks (A…☆13Jul 16, 2024Updated last year
- Official repository for ICLR 2024 Spotlight paper "Large Language Models Are Not Robust Multiple Choice Selectors"☆43May 20, 2025Updated last year
- I-SHEEP: Iterative Self-enHancEmEnt Paradigm of LLMs through Self-Instruct and Self-Assessment☆17Jan 16, 2025Updated last year
- Code for "FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge". EMNLP 2023.☆20Dec 25, 2023Updated 2 years ago
- Resolving Knowledge Conflicts in Large Language Models, COLM 2024☆18Oct 7, 2025Updated 8 months ago
- Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs [EMNLP 2023 Findings]☆24Nov 18, 2023Updated 2 years ago
- Clean, extensible implementation of MACAW [ICML 2021]☆12Dec 7, 2021Updated 4 years ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models☆59May 28, 2025Updated last year
- ☆10Nov 7, 2023Updated 2 years ago
- Repository for the ACL 2024 conference website☆18Feb 3, 2025Updated last year
- Code for CVPR2018 "Iterative Learning with Open-set Noisy Labels"☆12Mar 12, 2021Updated 5 years ago
- This repository includes the code implementation of the paper Improving Pacing in Long-Form Story Planning by Yichen Wang, Kevin Yang, Xi…☆17Nov 19, 2024Updated last year