This respository contains the code for extracting the test samples we used in our paper: "A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity"
☆81Nov 24, 2023Updated 2 years ago
Alternatives and similar repositories for chatgpt-evaluation
Users that are interested in chatgpt-evaluation are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- The implementation of the paper "Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters".☆17May 24, 2022Updated 3 years ago
- Source code of our paper "Focus on the Target’s Vocabulary: Masked Label Smoothing for Machine Translation" @ ACL 2022☆13Apr 13, 2022Updated 3 years ago
- 用Paddle复现论文ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information(ACL2021)☆10Nov 15, 2021Updated 4 years ago
- CAiRE in DialDoc21: Data Augmentation for Information-SeekingDialogue System☆11May 24, 2022Updated 3 years ago
- Versatile Generative Language Model☆25Oct 29, 2022Updated 3 years ago
- Wordpress hosting with auto-scaling on Cloudways • AdFully Managed hosting built for WordPress-powered businesses that need reliable, auto-scalable hosting. Cloudways SafeUpdates now available.
- ☆15Dec 10, 2021Updated 4 years ago
- ☆38Aug 20, 2024Updated last year
- Unified MultiWOZ evaluation scripts for the context-to-response task.☆59Oct 11, 2023Updated 2 years ago
- ☆14Aug 21, 2025Updated 7 months ago
- [EACL'23] MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages☆23Feb 13, 2023Updated 3 years ago
- Can ChatGPT really understand the opinions, sentiments, and emotions contained in the text? We provide a preliminary evaluation.☆54Sep 23, 2024Updated last year
- [ACL 2023] The code for our ACL'23 paper Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Pr…☆24Jun 1, 2024Updated last year
- A collection of instruction data and scripts for machine translation.☆20Sep 23, 2023Updated 2 years ago
- ☆25Dec 13, 2024Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- The official repository for Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapte…☆17Jan 15, 2024Updated 2 years ago
- ☆10Jun 16, 2021Updated 4 years ago
- Interpretable unified language safety checking with large language models☆32Apr 15, 2023Updated 2 years ago
- ☆10Sep 27, 2021Updated 4 years ago
- ☆15Oct 20, 2023Updated 2 years ago
- Detect hallucinated tokens for conditional sequence generation.☆64Apr 15, 2022Updated 3 years ago
- The code for paper "ProQA: Structural Prompt-based Pre-training for Unified Question Answering"☆11Feb 7, 2023Updated 3 years ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning☆30Mar 5, 2024Updated 2 years ago
- Dataset and model in the paper "SciXGen: A Scientific Paper Dataset for Context-Aware Text Generation"☆13Feb 14, 2022Updated 4 years ago
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- ☆21Oct 22, 2021Updated 4 years ago
- Trials of pre-trained BERT models for the medical domain in Japanese.☆12Nov 21, 2020Updated 5 years ago
- Code for generating Quasimodo, a commonsense knowledge base.☆20Sep 14, 2021Updated 4 years ago
- 🎁[ChatGPT4NLU] A Comparative Study on ChatGPT and Fine-tuned BERT☆192Apr 17, 2023Updated 2 years ago
- [ACL 2023] Code and Data Repo for Paper "Element-aware Summary and Summary Chain-of-Thought (SumCoT)"☆53Jan 21, 2024Updated 2 years ago
- The LM Contamination Index is a manually created database of contamination evidences for LMs.☆82Apr 11, 2024Updated last year
- Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge"☆17Mar 2, 2026Updated 3 weeks ago
- ☆19Jul 22, 2025Updated 8 months ago
- ☆21Oct 10, 2025Updated 5 months ago
- Proton VPN Special Offer - Get 70% off • AdSpecial partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
- The official code for our EMNLP 2022 long paper [Breaking the Representation Bottleneck of Chinese Characters: Neural Machine Translation…☆26Sep 10, 2025Updated 6 months ago
- A retrieval augmented sequence modeling toolkit implemented based on Fairseq☆29Mar 3, 2023Updated 3 years ago
- Resource, Evaluation and Detection Papers for ChatGPT☆455Mar 21, 2024Updated 2 years ago
- ☆31Apr 14, 2023Updated 2 years ago
- OFA-Compress is a unified framework which provides OFA model finetuning, distillation and inference capabilities in Huggingface version, …☆29Sep 22, 2022Updated 3 years ago
- The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)☆53Jun 12, 2022Updated 3 years ago
- Official code repository for the main conference paper in EMNLP 2022: SubeventWriter: Iterative Sub-event Sequence Generation with Cohere…☆11Oct 16, 2022Updated 3 years ago