xu1998hz/InstructScore_SEScore3

First explanation metric (diagnostic report) for text generation evaluation

☆62

Alternatives and similar repositories for InstructScore_SEScore3

Users who are interested in InstructScore_SEScore3 are comparing it to the libraries listed below.

xu1998hz / SEScore
This repo contains all the code for the SEScore implementation
☆15 · Mar 3, 2025 · Updated last year
xu1998hz / SEScore2
☆17 · Mar 3, 2025 · Updated last year
wenhuchen / WikiTables-WithLinks
Crawled Wikipedia Tables with Passages
☆14 · Aug 19, 2021 · Updated 4 years ago
zkx06111 / ALGO
☆36 · May 25, 2023 · Updated 3 years ago
maszhongming / UniEval
Repository for EMNLP 2022 Paper: Towards a Unified Multi-Dimensional Evaluator for Text Generation
☆217 · Feb 10, 2024 · Updated 2 years ago
XinyuanLu00 / SciTab
The project page for "SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables"
☆23 · Dec 21, 2023 · Updated 2 years ago
google-research / mt-metrics-eval
Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks.
☆132 · Apr 23, 2026 · Updated 3 months ago
MicrosoftTranslator / GEMBA
GEMBA: GPT Estimation Metric Based Assessment
☆152 · Dec 15, 2025 · Updated 7 months ago
thu-coai / OpenMEVA
Benchmark for evaluating open-ended generation
☆50 · Nov 6, 2024 · Updated last year
TIGER-AI-Lab / TIGERScore
"TIGERScore: Towards Building Explainable Metric for All Text Generation Tasks" [TMLR 2024]
☆32 · Dec 21, 2024 · Updated last year
XinyuanLu00 / QACheck
Data and code for the EMNLP 2023 System Demo paper "QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking"
☆19 · Dec 19, 2023 · Updated 2 years ago
BinWang28 / FacEval
EMNLP 2022: Analyzing and Evaluating Faithfulness in Dialogue Summarization
☆13 · Mar 20, 2025 · Updated last year
teacherpeterpan / ProgramFC
Code for the ACL 2023 paper "Fact-Checking Complex Claims with Program-Guided Reasoning"
☆32 · Jun 2, 2023 · Updated 3 years ago
EleanorJiang / BlonDe
Official implementations for (1) BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation and (2) Discourse Centric …
☆85 · Sep 21, 2023 · Updated 2 years ago
exe1023 / DialEvalMetrics
☆62 · Oct 30, 2022 · Updated 3 years ago
benpry / chain-of-thought-metaphor
This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod…
☆14 · Apr 28, 2023 · Updated 3 years ago
Shikib / fed
Code for SIGdial 2020 paper: Unsupervised Evaluation of Interactive Dialog with DialoGPT (https://arxiv.org/abs/2006.12719)
☆28 · Jun 8, 2020 · Updated 6 years ago
fe1ixxu / CPO_SIMPO
This repository combines the CPO and SimPO methods for better reference-free preference learning.
☆59 · Aug 13, 2024 · Updated last year
i-Eval / FairEval
☆145 · Sep 10, 2023 · Updated 2 years ago
CPF-NLPR / IncrementalED
☆18 · Oct 19, 2020 · Updated 5 years ago
adiSimhi / Interpreting-Embedding-Spaces-by-Conceptualization
☆15 · Oct 17, 2023 · Updated 2 years ago
ylsung / vl-merging
PyTorch code for the paper "An Empirical Study of Multimodal Model Merging"
☆37 · Oct 11, 2023 · Updated 2 years ago
dqwang122 / LSSAMP
Code for paper 'Accelerating Antimicrobial Peptide Discovery with Latent Sequence-Structure Model'
☆13 · Mar 21, 2024 · Updated 2 years ago
umd-huang-lab / Mementos
☆32 · Feb 8, 2024 · Updated 2 years ago
VinAIResearch / tise-toolbox
TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation (ECCV 2022)
☆34 · Nov 12, 2024 · Updated last year
swarnaHub / ExplaGraphs
[EMNLP 2021] Dataset and PyTorch Code for ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning
☆14 · Nov 5, 2022 · Updated 3 years ago
michaelsaxon / CoCoCroLa
The Conceptual Coverage Across Languages Benchmark for Text-to-Image Models
☆12 · Oct 28, 2024 · Updated last year
matt-seb-ho / WikiWhy
WikiWhy is a new benchmark for evaluating LLMs' ability to explain cause-effect relationships. It is a QA dataset containing 9000…
☆49 · Dec 7, 2023 · Updated 2 years ago
XuandongZhao / pf-decoding
[ICLR 2025] Permute-and-Flip: An optimally robust and watermarkable decoder for LLMs
☆19 · Mar 20, 2025 · Updated last year
THU-KEG / EvaluationPapers4ChatGPT
Resource, Evaluation and Detection Papers for ChatGPT
☆456 · Mar 21, 2024 · Updated 2 years ago
pacman100 / accelerate-deepspeed-test
Testing DeepSpeed integration in 🤗 Accelerate
☆11 · Jun 28, 2022 · Updated 4 years ago
e0397123 / dstc10_metric_track
The Official Repository for the Automatic Dialogue Evaluation Sub-task of DSTC10 Track 5 (Automatic Evaluation and Moderation of Open-dom…
☆19 · Nov 1, 2021 · Updated 4 years ago
Unbabel / COMET
A Neural Framework for MT Evaluation
☆770 · Apr 21, 2026 · Updated 3 months ago
nyu-mll / ILF-for-code-generation
☆81 · Mar 24, 2025 · Updated last year
GAIR-NLP / MetaCritique
Evaluate the Quality of Critique
☆37 · Jun 1, 2024 · Updated 2 years ago
mukhal / intrinsic-source-citation
[COLM '24] Source-Aware Training Enables Knowledge Attribution in Language Models
☆19 · Apr 1, 2025 · Updated last year
Marker-Inc-Korea / CoT-llama2
Fine-tuning llama2 using the chain-of-thought approach
☆10 · Nov 18, 2023 · Updated 2 years ago
TIGER-AI-Lab / StructLM
Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)
☆76 · Oct 19, 2024 · Updated last year
thunlp / ConvDR
Code repo for SIGIR 2021 paper "Few-Shot Conversational Dense Retrieval"
☆43 · Dec 9, 2021 · Updated 4 years ago