Source Code of Paper "GPTScore: Evaluate as You Desire"
☆259 Feb 21, 2023 Updated 3 years ago
Alternatives and similar repositories for GPTScore
Users that are interested in GPTScore are comparing it to the libraries listed below.
- Code for paper "G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment" ☆421 Feb 4, 2024 Updated 2 years ago
- Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study ☆43 Mar 8, 2023 Updated 3 years ago
- Repository for EMNLP 2022 Paper: Towards a Unified Multi-Dimensional Evaluator for Text Generation ☆218 Feb 10, 2024 Updated 2 years ago
- ☆40 Jun 7, 2023 Updated 2 years ago
- A benchmark dataset for evaluating dialog system and natural language generation metrics. ☆39 Jun 13, 2022 Updated 3 years ago
- [ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets ☆218 Dec 24, 2023 Updated 2 years ago
- A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic… ☆438 Apr 13, 2025 Updated last year
- ☆62 Oct 30, 2022 Updated 3 years ago
- BARTScore: Evaluating Generated Text as Text Generation ☆369 Jun 27, 2022 Updated 3 years ago
- Dataset, metrics, and models for TACL 2023 paper MACSUM: Controllable Summarization with Mixed Attributes. ☆34 Jul 25, 2023 Updated 2 years ago
- Code for the ICLR 2019 paper "Learning to Represent Edits" ☆13 Dec 8, 2022 Updated 3 years ago
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models ☆613 Jun 26, 2024 Updated last year
- ☆144 Sep 10, 2023 Updated 2 years ago
- ACL2023 - AlignScore, a metric for factual consistency evaluation. ☆163 Mar 11, 2024 Updated 2 years ago
- Codebase, data and models for the SummaC paper in TACL ☆109 Jan 30, 2025 Updated last year
- Codes for our paper "CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation" (ACL 2022) ☆33 Jun 6, 2022 Updated 3 years ago
- Resources for paper "DialSummEval: Revisiting summarization evaluation for dialogues" ☆14 Jul 22, 2025 Updated 10 months ago
- The git repository of Modular Prompted Chatbot paper ☆35 May 24, 2023 Updated 3 years ago
- ☆11 Apr 13, 2023 Updated 3 years ago
- EMNLP'2022: BERTScore is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation ☆41 Oct 19, 2022 Updated 3 years ago
- ☆772 Jun 13, 2024 Updated last year
- Code for EMNLP 2023 findings paper "A Closer Look into Using Large Language Models for Automatic Evaluation" ☆19 Oct 9, 2023 Updated 2 years ago
- Emotion-Aware Dialogue Response Generation by Multi-Task Learning ☆13 Jan 22, 2022 Updated 4 years ago
- Complexity Based Prompting for Multi-Step Reasoning ☆17 Mar 10, 2023 Updated 3 years ago
- BERT score for text generation ☆1,899 Jul 30, 2024 Updated last year
- This repository contains a collection of papers and resources on Reasoning in Large Language Models. ☆569 Nov 13, 2023 Updated 2 years ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning ☆31 Mar 5, 2024 Updated 2 years ago
- Resource, Evaluation and Detection Papers for ChatGPT ☆455 Mar 21, 2024 Updated 2 years ago
- Paper collections of retrieval-based (augmented) language model. ☆232 May 24, 2024 Updated 2 years ago
- ☆284 Jan 6, 2025 Updated last year
- Code for the ACL2022 paper "Synthetic Question Value Estimation for Domain Adaptation of Question Answering" ☆17 Mar 21, 2022 Updated 4 years ago
- 🎭 Official code and dataset for our CCGPK@COLING 2022 paper - "PersonaChatGen: Generating Personalized Dialogue using GPT-3" ☆13 Mar 26, 2024 Updated 2 years ago
- LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation ☆134 Oct 25, 2023 Updated 2 years ago
- ☆34 Mar 25, 2023 Updated 3 years ago
- Data and Code Release for "On the Potential of Lexico-logical Alignments for Semantic Parsing to SQL Queries" ☆55 Nov 9, 2020 Updated 5 years ago
- Interpretable unified language safety checking with large language models ☆32 Apr 15, 2023 Updated 3 years ago
- [ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following ☆138 Jul 8, 2024 Updated last year
- Question Answering and Generation for Summarization ☆71 Nov 27, 2022 Updated 3 years ago
- ☆23 Feb 26, 2024 Updated 2 years ago