THU-KEG / KoLA
[ICLR24] The open-source repo of THU-KEG's KoLA benchmark.
☆52 · Sep 28, 2023 · Updated 2 years ago
Alternatives and similar repositories for KoLA
Users who are interested in KoLA are comparing it to the repositories listed below.
- Repo for the question-in-context rewriting baseline presented in Elgohary et al. "Can you unpack that? Learning to rewrite questions-in-c… ☆23 · May 20, 2020 · Updated 5 years ago
- [NAACL 2024 Findings] Evaluation suite for the systematic evaluation of instruction selection methods. ☆23 · Jul 26, 2023 · Updated 2 years ago
- DICE: Detecting In-distribution Data Contamination with LLM's Internal State ☆11 · Sep 21, 2024 · Updated last year
- Code for our project CROWN (Conversational Passage Ranking by Reasoning over Word Networks) ☆10 · Jan 11, 2024 · Updated 2 years ago
- Repository of paper "Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis" (ACL 2025 Main) ☆19 · Jul 19, 2025 · Updated 6 months ago
- Xlore2.0 Code [BaiduExtractor, HudongExtractor, WikiExtractor, XloreData, XloreWeb] ☆12 · Apr 5, 2017 · Updated 8 years ago
- Know2BIO: A Comprehensive Dual-View Benchmark for Evolving Biomedical Knowledge Graphs ☆14 · Updated this week
- Medical multi-modal learning with missing modality data (MLHC 2023) ☆14 · Aug 1, 2023 · Updated 2 years ago
- Repository for Teaching Broad Reasoning Skills for Multi-Step QA by Generating Hard Contexts, EMNLP22 ☆19 · Jun 23, 2023 · Updated 2 years ago
- ☆17 · Aug 7, 2024 · Updated last year
- The official code and dataset for EMNLP 2022 paper "COPEN: Probing Conceptual Knowledge in Pre-trained Language Models". ☆21 · Mar 9, 2023 · Updated 2 years ago
- ☆19 · Feb 3, 2022 · Updated 4 years ago
- [ACL2023] Source code for Decouple knowledge from parameters for plug-and-play language modeling ☆20 · Sep 18, 2023 · Updated 2 years ago
- GLM-SIMPLE-EVALS: The evaluation repository for the GLM-4.5 series of models by Z.ai. ☆39 · Oct 17, 2025 · Updated 3 months ago
- A Bilingual Role Evaluation Benchmark for Large Language Models ☆43 · Jan 9, 2024 · Updated 2 years ago
- SUPERVAIZER is a toolkit built for the age of AI interoperability. At its core, it implements Google's Agent-to-Agent (A2A) protocol, ena… ☆14 · Feb 4, 2026 · Updated last week
- This code accompanies the paper DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering. ☆16 · Mar 20, 2023 · Updated 2 years ago
- Repository for Decomposed Prompting ☆95 · Nov 15, 2023 · Updated 2 years ago
- ☆58 · Jun 30, 2023 · Updated 2 years ago
- Code and dataset for the ACL 2021 paper "TWAG: A Topic-guided Wikipedia Abstract Generator" ☆20 · Aug 9, 2021 · Updated 4 years ago
- ⏳ ChatLog: Recording and Analysing ChatGPT Across Time ☆103 · May 30, 2024 · Updated last year
- Dataset of Clarification Questions ☆21 · Jun 15, 2020 · Updated 5 years ago
- ☆99 · Dec 5, 2023 · Updated 2 years ago
- ☆37 · Jan 25, 2024 · Updated 2 years ago
- This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models. ☆552 · Feb 12, 2024 · Updated 2 years ago
- An easy-to-use python toolkit for flexibly adapting various neural ranking models to target domain. ☆60 · May 17, 2023 · Updated 2 years ago
- [EMNLP2024] Benchmark for "Large Language Models Are Poor Clinical Decision-Makers: A Comprehensive Benchmark" ☆35 · Sep 18, 2025 · Updated 4 months ago
- The AI Radiologist You Can Chat With ☆24 · Aug 4, 2023 · Updated 2 years ago
- TRAM: Benchmarking Temporal Reasoning for Large Language Models (Findings of ACL 2024) ☆26 · Jun 21, 2024 · Updated last year
- Chinese Large Language Model Evaluation, Phase One ☆112 · Oct 23, 2023 · Updated 2 years ago
- ☆30 · Sep 5, 2021 · Updated 4 years ago
- ☆28 · Sep 21, 2024 · Updated last year
- Resource, Evaluation and Detection Papers for ChatGPT ☆456 · Mar 21, 2024 · Updated last year
- Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large Language Models) ☆51 · May 12, 2025 · Updated 9 months ago
- ☆28 · Nov 29, 2022 · Updated 3 years ago
- [NeurIPS 2023] TFLEX: Temporal Feature-Logic Embedding Framework for Complex Reasoning over Temporal Knowledge Graph ☆38 · Oct 17, 2025 · Updated 3 months ago
- Implementation of ICML 23 Paper: Specializing Smaller Language Models towards Multi-Step Reasoning. ☆132 · Jun 18, 2023 · Updated 2 years ago
- ☆12 · Sep 25, 2023 · Updated 2 years ago
- Self hosted AI workflow for scraping Instagram Reels (audio and description). Extracting, summarising and categorising, then storing all … ☆27 · Jan 10, 2026 · Updated last month