YFHuangxxxx / CBBQ
☆23Updated last year
Related projects
Alternatives and complementary repositories for CBBQ
- This repo is for the paper: On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark☆24Updated 2 years ago
- Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning"☆57Updated 2 years ago
- EMNLP'2023: Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration☆32Updated 8 months ago
- ☆10Updated last year
- LLMDet is a text detection tool that can identify which source generated a given text (e.g. a large language model or a human writer).☆50Updated 5 months ago
- OMGEval😮: An Open Multilingual Generative Evaluation Benchmark for Foundation Models☆32Updated 3 months ago
- ☆22Updated last year
- The source code of paper "CHEF: A Pilot Chinese Dataset for Evidence-Based Fact-Checking"☆68Updated last year
- Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs [EMNLP 2023 Findings]☆21Updated 11 months ago
- ☆36Updated 10 months ago
- Source code and dataset for EMNLP 2022 paper "MAVEN-ERE: A Unified Large-scale Dataset for Event Coreference, Temporal, Causal, and Subev…☆78Updated last year
- ☆59Updated last year
- The information of NLP PhD application in the world.☆35Updated 2 months ago
- Do Large Language Models Know What They Don’t Know?☆85Updated this week
- Official Code for "PPT: Pre-trained Prompt Tuning for Few-shot Learning". ACL 2022☆108Updated 2 years ago
- Official Implementation of "Probing Language Models for Pre-training Data Detection"☆16Updated 5 months ago
- ☆46Updated 4 months ago
- ☆63Updated 5 months ago
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"☆109Updated 5 months ago
- ☆15Updated 9 months ago
- ☆47Updated 2 months ago
- Data and codes for EMNLP 2022 paper "CDConv: A Benchmark for Contradiction Detection in Chinese Conversations"☆15Updated last year
- Implementation of "Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation"☆77Updated last year
- Code and data for the paper "Can Large Language Models Understand Real-World Complex Instructions?" (AAAI 2024)☆44Updated 6 months ago
- 🩺 A collection of ChatGPT evaluation reports on various benchmarks.☆48Updated last year
- ☆42Updated 11 months ago
- Code for "FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models (ACL 2024)"☆86Updated last week
- Collection of papers for scalable automated alignment.☆71Updated 2 weeks ago
- Paper list of "The Life Cycle of Knowledge in Big Language Models: A Survey"☆61Updated last year