This repository contains the code for extracting the test samples we used in our paper: "A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity"
☆80Nov 24, 2023Updated 2 years ago
Alternatives and similar repositories for chatgpt-evaluation
Users that are interested in chatgpt-evaluation are comparing it to the libraries listed below.
- The implementation of the paper "Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters".☆17May 24, 2022Updated 4 years ago
- Source code of our paper "Focus on the Target’s Vocabulary: Masked Label Smoothing for Machine Translation" @ ACL 2022☆13Apr 13, 2022Updated 4 years ago
- Paddle reproduction of the paper "ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information" (ACL 2021)☆10Nov 15, 2021Updated 4 years ago
- Towards Few-Shot Fact-Checking via Perplexity☆13Jun 11, 2021Updated 4 years ago
- CAiRE in DialDoc21: Data Augmentation for Information-Seeking Dialogue System☆11May 24, 2022Updated 4 years ago
- Versatile Generative Language Model☆25Oct 29, 2022Updated 3 years ago
- ☆15Dec 10, 2021Updated 4 years ago
- ☆38Aug 20, 2024Updated last year
- Unified MultiWOZ evaluation scripts for the context-to-response task.☆59Oct 11, 2023Updated 2 years ago
- ☆14Aug 21, 2025Updated 9 months ago
- [EACL'23] MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages☆23Feb 13, 2023Updated 3 years ago
- Mutual Information Predicts Hallucinations in Abstractive Summarization☆13Nov 14, 2022Updated 3 years ago
- Can ChatGPT really understand the opinions, sentiments, and emotions contained in the text? We provide a preliminary evaluation.☆54Sep 23, 2024Updated last year
- [ACL 2023] The code for our ACL'23 paper Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Pr…☆24Jun 1, 2024Updated last year
- ☆25Dec 13, 2024Updated last year
- A collection of instruction data and scripts for machine translation.☆20Sep 23, 2023Updated 2 years ago
- The official repository for Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapte…☆17Jan 15, 2024Updated 2 years ago
- ☆10Jun 16, 2021Updated 4 years ago
- Interpretable unified language safety checking with large language models☆32Apr 15, 2023Updated 3 years ago
- ☆10Sep 27, 2021Updated 4 years ago
- ☆15Oct 20, 2023Updated 2 years ago
- Detect hallucinated tokens for conditional sequence generation.☆64Apr 15, 2022Updated 4 years ago
- Pre-processing text and tokenization for UTH-BERT☆10Sep 30, 2020Updated 5 years ago
- The code for paper "ProQA: Structural Prompt-based Pre-training for Unified Question Answering"☆11Feb 7, 2023Updated 3 years ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning☆31Mar 5, 2024Updated 2 years ago
- Dataset and model in the paper "SciXGen: A Scientific Paper Dataset for Context-Aware Text Generation"☆13Feb 14, 2022Updated 4 years ago
- ☆21Oct 22, 2021Updated 4 years ago
- [COLING2022] A Multi-turn Machine Reading Comprehension Framework with Rethink Mechanism for Emotion-Cause Pair Extraction☆18Oct 13, 2022Updated 3 years ago
- 🎁[ChatGPT4NLU] A Comparative Study on ChatGPT and Fine-tuned BERT☆191Apr 17, 2023Updated 3 years ago
- [ACL 2023] Code and Data Repo for Paper "Element-aware Summary and Summary Chain-of-Thought (SumCoT)"☆53Jan 21, 2024Updated 2 years ago
- The LM Contamination Index is a manually created database of contamination evidences for LMs.☆81Apr 11, 2024Updated 2 years ago
- Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge"☆17Mar 2, 2026Updated 2 months ago
- ☆19Jul 22, 2025Updated 10 months ago
- The official code for our EMNLP 2022 long paper [Breaking the Representation Bottleneck of Chinese Characters: Neural Machine Translation…☆26Sep 10, 2025Updated 8 months ago
- A retrieval augmented sequence modeling toolkit implemented based on Fairseq☆29Mar 3, 2023Updated 3 years ago
- Resource, Evaluation and Detection Papers for ChatGPT☆455Mar 21, 2024Updated 2 years ago
- ☆31Apr 14, 2023Updated 3 years ago
- OFA-Compress is a unified framework which provides OFA model finetuning, distillation and inference capabilities in Huggingface version, …☆29Sep 22, 2022Updated 3 years ago
- The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)☆52Jun 12, 2022Updated 3 years ago