marzenakrp / demetr
Repository for DEMETR: Diagnosing Evaluation Metrics for Translation
☆17Nov 29, 2022Updated 3 years ago
Alternatives and similar repositories for demetr
Users who are interested in demetr are comparing it to the libraries listed below.
- ☆29Dec 2, 2024Updated last year
- Official codebase accompanying our ACL 2022 paper "RELiC: Retrieving Evidence for Literary Claims" (https://relic.cs.umass.edu).☆20May 14, 2022Updated 3 years ago
- ☆12Sep 1, 2021Updated 4 years ago
- Neural Fuzzy Repair (NFR) is a data augmentation pipeline, which integrates fuzzy matches (i.e. similar translations) into neural machine…☆12Aug 14, 2024Updated last year
- Code & data for EMNLP 2020 paper "MOCHA: A Dataset for Training and Evaluating Reading Comprehension Metrics".☆16May 3, 2022Updated 3 years ago
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks.☆126Oct 13, 2025Updated 4 months ago
- To help search, filter, and download papers from 'acl anthology' (https://aclanthology.org/).☆18Sep 12, 2024Updated last year
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT☆22May 24, 2023Updated 2 years ago
- ☆21Dec 8, 2022Updated 3 years ago
- Code and data for the paper "Disentangling Uncertainty in Machine Translation Evaluation", accepted at EMNLP 2022.☆23Jun 23, 2023Updated 2 years ago
- ☆24Apr 2, 2024Updated last year
- [ChatGPT4MTevaluation] ErrorAnalysis Prompt for MT Evaluation in ChatGPT☆92Oct 14, 2025Updated 4 months ago
- Alternative implementation of the coreference scorer for the CoNLL-2011/2012 shared tasks on coreference resolution☆11Apr 29, 2021Updated 4 years ago
- human_detectors hosts the data released from the paper "People who frequently use ChatGPT for writing tasks are accurate and robust detec…☆44May 9, 2025Updated 9 months ago
- ☆75Jul 2, 2021Updated 4 years ago
- ☆14Apr 29, 2025Updated 9 months ago
- some useless python stuff☆11Jul 30, 2020Updated 5 years ago
- Multilingual Quality Estimation and Automatic Post-editing Dataset☆42Mar 24, 2022Updated 3 years ago
- A framework for evaluating Machine Translation models.☆12May 26, 2025Updated 8 months ago
- EMNLP DiscoEval paper☆43Nov 12, 2019Updated 6 years ago
- Official repository for our EACL 2023 paper "LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization" (https…☆44Aug 10, 2024Updated last year
- ☆11Jun 5, 2024Updated last year
- ☆11Nov 10, 2015Updated 10 years ago
- An implementation of "Subspace Representations for Soft Set Operations and Sentence Similarities" (NAACL 2024)☆10May 31, 2024Updated last year
- Code for the ICLR'24 paper: MT-RANKER : Reference-free machine translation evaluation by inter-system ranking☆10Feb 29, 2024Updated last year
- ☆10Mar 19, 2024Updated last year
- Github repository for "Internalizing World Models via Self-Play Finetuning for Agentic RL"☆33Nov 1, 2025Updated 3 months ago
- codes for "Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models"☆12Feb 10, 2025Updated last year
- Code for paper "Reasoning Like an Economist: Post-Training on Economic Problems Induces Strategic Generalization in LLMs"☆12Jun 11, 2025Updated 8 months ago
- Fair paper matching☆11Jan 20, 2020Updated 6 years ago
- Library for experimenting with state-of-the-art evaluation metrics like UScore☆12May 27, 2023Updated 2 years ago
- Reasoning-based Evaluation and Ranking of Translations.☆19Jul 18, 2025Updated 6 months ago
- ☆22Updated this week
- Owl Eyes: Spotting UI Display Issues via Visual Understanding☆11Jul 31, 2020Updated 5 years ago
- Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://a…☆46Jul 30, 2022Updated 3 years ago
- Overview of all team compositions for Genshin Impact four-star characters, with gcsim-based DPS calculations (work in progress)☆16Oct 19, 2024Updated last year
- Materials for "Multi-property Steering of Large Language Models with Dynamic Activation Composition"☆14Nov 22, 2024Updated last year
- Multi-Figurative Language Generation (COLING 2022)☆12Jan 30, 2023Updated 3 years ago
- Code for building and experimenting on saliency maps for RL agents.☆12Feb 13, 2020Updated 6 years ago