[ICML'2024] Can AI Assistants Know What They Don't Know?
☆85 · Feb 5, 2024 · Updated 2 years ago
Alternatives and similar repositories for Say-I-Dont-Know
Users that are interested in Say-I-Dont-Know are comparing it to the libraries listed below
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't… ☆130 · Jul 10, 2024 · Updated last year
- Repo for paper: Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge ☆14 · Feb 20, 2024 · Updated 2 years ago
- ☆78 · May 22, 2024 · Updated last year
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models" ☆136 · Jun 5, 2024 · Updated last year
- Code & Data for our Paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations" ☆69 · Feb 27, 2024 · Updated 2 years ago
- Grade-School Math with Irrelevant Context (GSM-IC) benchmark is an arithmetic reasoning dataset built upon GSM8K, by adding irrelevant se… ☆65 · Feb 13, 2023 · Updated 3 years ago
- The codebase for "Learning from Easy to Complex: Adaptive Multi-curricula Learning for Neural Dialogue Generation" (Cai et al., AAAI 2020… ☆20 · Jun 18, 2024 · Updated last year
- [Findings of EMNLP'2024] Unified Active Retrieval for Retrieval Augmented Generation ☆23 · Sep 30, 2024 · Updated last year
- ☆13 · Jan 14, 2026 · Updated last month
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues ☆13 · Feb 7, 2024 · Updated 2 years ago
- ☆43 · Sep 3, 2024 · Updated last year
- Code for the arXiv preprint "MoT: Pre-thinking and Recalling Enable ChatGPT to Self-Improve with Memory-of-Thoughts" ☆24 · Nov 29, 2023 · Updated 2 years ago
- Code for Findings of ACL 2021 paper "Addressing Inquiries about History: An Efficient and Practical Framework for Evaluating Open-domain … ☆19 · Dec 16, 2022 · Updated 3 years ago
- Optimized inference with Ascend and Hugging Face ☆12 · Apr 23, 2024 · Updated last year
- [EMNLP 2022] RLET: A Reinforcement Learning Based Approach for Explainable QA with Entailment Trees ☆11 · Jul 15, 2023 · Updated 2 years ago
- Code for "HiChunk: Evaluating and Enhancing Retrieval-Augmented Generation with Hierarchical Chunking" ☆89 · Nov 18, 2025 · Updated 3 months ago
- Responsible Robotic Manipulation ☆16 · Aug 31, 2025 · Updated 6 months ago
- A local search system implementation using Elasticsearch for Wikipedia data indexing and retrieval. ☆12 · May 17, 2025 · Updated 9 months ago
- [ACL 2024] Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation ☆10 · May 26, 2024 · Updated last year
- "Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs" 2023 ☆16 · Nov 28, 2024 · Updated last year
- Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators" ☆12 · Mar 25, 2025 · Updated 11 months ago
- The repository of the ACCV 2024 paper "FG-CXR: A Radiologist-Aligned Gaze Dataset for Enhancing Interpretability in Chest X-Ray Report Ge… ☆11 · Jul 28, 2025 · Updated 7 months ago
- Self-Knowledge Guided Retrieval Augmentation for Large Language Models (EMNLP Findings 2023) ☆28 · Dec 8, 2023 · Updated 2 years ago
- [EMNLP 2023] Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts ☆27 · Nov 4, 2023 · Updated 2 years ago
- ☆109 · Jul 15, 2025 · Updated 7 months ago
- This is the official repo for Towards Uncertainty-Aware Language Agent. ☆31 · Aug 15, 2024 · Updated last year
- Radiology Language Evaluations ☆11 · Nov 17, 2023 · Updated 2 years ago
- A framework to train language models to learn invariant representations. ☆14 · Jan 24, 2022 · Updated 4 years ago
- Distributional Generalization in NLP. A roadmap. ☆88 · Dec 12, 2022 · Updated 3 years ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆183 · Jul 23, 2025 · Updated 7 months ago
- A survey of long-context LLMs from four perspectives: architecture, infrastructure, training, and evaluation ☆61 · Mar 31, 2025 · Updated 11 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 · Feb 23, 2024 · Updated 2 years ago
- This is the code repo for the paper <UTC-IE: A Unified Token-pair Classification Architecture for Information Extraction> ☆15 · Aug 10, 2023 · Updated 2 years ago
- ☆32 · Aug 26, 2025 · Updated 6 months ago
- The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning (NeurIPS 2022) ☆16 · Feb 11, 2023 · Updated 3 years ago
- ☆15 · Jul 9, 2025 · Updated 7 months ago
- [ICLR 2025] Code&Data for the paper "Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization" ☆14 · Jun 21, 2024 · Updated last year
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆58 · Feb 29, 2024 · Updated 2 years ago
- [ICLR 2026] Efficient Agent Training for Computer Use ☆138 · Sep 5, 2025 · Updated 5 months ago