Source Code of Paper "GPTScore: Evaluate as You Desire"
☆258Feb 21, 2023Updated 3 years ago
Alternatives and similar repositories for GPTScore
Users that are interested in GPTScore are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Code for paper "G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment"☆424Feb 4, 2024Updated 2 years ago
- Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study☆42Mar 8, 2023Updated 3 years ago
- Repository for EMNLP 2022 Paper: Towards a Unified Multi-Dimensional Evaluator for Text Generation☆218Feb 10, 2024Updated 2 years ago
- ☆40Jun 7, 2023Updated 3 years ago
- A benchmark dataset for evaluating dialog system and natural language generation metrics.☆39Jun 13, 2022Updated 4 years ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- [ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets☆218Dec 24, 2023Updated 2 years ago
- A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic…☆440Apr 13, 2025Updated last year
- ☆62Oct 30, 2022Updated 3 years ago
- BARTScore: Evaluating Generated Text as Text Generation☆369Jun 27, 2022Updated 3 years ago
- Dataset, metrics, and models for TACL 2023 paper MACSUM: Controllable Summarization with Mixed Attributes.☆34Jul 25, 2023Updated 2 years ago
- Code for the ICLR 2019 paper "Learning to Represent Edits"☆13Dec 8, 2022Updated 3 years ago
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models☆620Jun 26, 2024Updated last year
- ☆145Sep 10, 2023Updated 2 years ago
- ACL2023 - AlignScore, a metric for factual consistency evaluation.☆164Mar 11, 2024Updated 2 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- Codebase, data and models for the SummaC paper in TACL☆109Jan 30, 2025Updated last year
- Codes for our paper "CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation" (ACL 2022)☆33Jun 6, 2022Updated 4 years ago
- Resources for paper "DialSummEval: Revisiting summarization evaluation for dialogues"☆14Jul 22, 2025Updated 10 months ago
- The git repository of Modular Prompted Chatbot paper☆35May 24, 2023Updated 3 years ago
- ☆11Apr 13, 2023Updated 3 years ago
- EMNLP'2022: BERTScore is Unfair: On Social Bias in Language Model-Based Metrics for Text Generation☆41Oct 19, 2022Updated 3 years ago
- ☆775Jun 13, 2024Updated 2 years ago
- Code for EMNLP 2023 findings paper "A Closer Look into Using Large Language Models for Automatic Evaluation"☆19Oct 9, 2023Updated 2 years ago
- Emotion-Aware Dialogue Response Generation by Multi-Task Learning☆13Jan 22, 2022Updated 4 years ago
- Serverless GPU API endpoints on Runpod - Get Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- Complexity Based Prompting for Multi-Step Reasoning☆17Mar 10, 2023Updated 3 years ago
- BERT score for text generation☆1,901Jul 30, 2024Updated last year
- This repository contains a collection of papers and resources on Reasoning in Large Language Models.☆571Nov 13, 2023Updated 2 years ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning☆31Mar 5, 2024Updated 2 years ago
- Resource, Evaluation and Detection Papers for ChatGPT☆455Mar 21, 2024Updated 2 years ago
- Paper collections of retrieval-based (augmented) language model.☆232May 24, 2024Updated 2 years ago
- ☆284Jan 6, 2025Updated last year
- Code for the ACL2022 paper "Synthetic Question Value Estimation for Domain Adaptation of Question Answering"☆18Mar 21, 2022Updated 4 years ago
- 🎭 Official code and dataset for our CCGPK@COLING 2022 paper - "PersonaChatGen: Generating Personalized Dialogue using GPT-3"☆13Mar 26, 2024Updated 2 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- ☆34Mar 25, 2023Updated 3 years ago
- Data and Code Release for "On the Potential of Lexico-logical Alignments for Semantic Parsing to SQL Queries"☆55Nov 9, 2020Updated 5 years ago
- Interpretable unified language safety checking with large language models☆32Apr 15, 2023Updated 3 years ago
- [ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following☆138Jul 8, 2024Updated last year
- Question Answering and Generation for Summarization☆72Nov 27, 2022Updated 3 years ago
- ☆23Feb 26, 2024Updated 2 years ago
- This repository contains some of the code used in the paper "Training Language Models with Langauge Feedback at Scale"☆26Mar 30, 2023Updated 3 years ago