zouharvi/subset2evaluate

Find informative examples to efficiently (human)-evaluate NLG models.

☆17

Alternatives and similar repositories for subset2evaluate

Users who are interested in subset2evaluate are comparing it to the libraries listed below.

Babelscape / WSL
Word Sense Linking model is designed to identify and disambiguate spans of text to their most suitable senses from a reference inventory.
☆13 · Aug 23, 2024 · Updated last year
SapienzaNLP / zebra
☆15 · Dec 26, 2024 · Updated last year
dayeonki / mt_feedback
Code for "Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations" [NAACL Findings 2024]
☆14 · Apr 3, 2026 · Updated 3 months ago
Principled-Intelligence / orbitals
☆17 · Updated this week
Babelscape / FENICE
FENICE (Factuality Evaluation of Summarization based on Natural Language Inference and Claim Extraction) is a factuality-oriented metric …
☆30 · Nov 29, 2024 · Updated last year
masakhane-io / africomet
COMET for African languages
☆11 · Jan 24, 2025 · Updated last year
yoavgur / PISCES
🪝PISCES - Precise In-Parameter Suppression for Concept EraSure in Large Language Models
☆13 · Jun 28, 2026 · Updated 3 weeks ago
google-research / mt-metrics-eval
Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks.
☆132 · Apr 23, 2026 · Updated 2 months ago
xiye17 / EvalQAExpl
Code for Evaluating Explanations for Reading Comprehension with Realistic Counterfactuals.
☆17 · Apr 25, 2021 · Updated 5 years ago
Coldmist-Lu / MQM_APE
[MQM-APE] Toward High-Quality Error Annotation Predictors with Automatic Post-Editing in LLM Translation Evaluators.
☆12 · Sep 24, 2024 · Updated last year
SALT-NLP / CoAnnotating
This is the official repository for "CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data An…
☆24 · Oct 26, 2023 · Updated 2 years ago
visinf / fast-axiomatic-attribution
Fast Axiomatic Attribution for Neural Networks (NeurIPS*2021)
☆15 · Feb 24, 2026 · Updated 4 months ago
mt-upc / transformer-contributions
Measuring the Mixing of Contextual Information in the Transformer
☆35 · May 27, 2023 · Updated 3 years ago
mt-upc / transformer-contributions-nmt
☆18 · Oct 6, 2022 · Updated 3 years ago
Yao-Dou / LENS
☆25 · May 11, 2024 · Updated 2 years ago
technion-cs-nlp / parametric-faithfulness
☆23 · Aug 30, 2025 · Updated 10 months ago
samiraabnar / Bridge
Making a bridge between NLP models and Brain data
☆19 · Jun 3, 2020 · Updated 6 years ago
Elbria / xformal-FoST
Code and data for the NAACL 2021 paper: "XFORMAL: A Benchmark for Multilingual Formality Style Transfer"
☆12 · Jun 7, 2021 · Updated 5 years ago
Riccorl / ipa
NLP Preprocessing Pipeline Wrappers
☆11 · May 12, 2023 · Updated 3 years ago
Smu-Tan / Remedy
[EMNLP2025] Remedy: Learning Machine Translation Evaluation from Human Preferences with Reward Modeling
☆16 · Nov 20, 2025 · Updated 8 months ago
bglezseoane / gitcher
The git profile switcher
☆15 · Apr 18, 2020 · Updated 6 years ago
mohsenfayyaz / DecompX
DecompX: Explaining Transformers Decisions by Propagating Token Decomposition [ACL 2023]
☆19 · Jul 3, 2025 · Updated last year
mohsenfayyaz / GlobEnc
[NAACL 2022] GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers
☆21 · May 16, 2023 · Updated 3 years ago
Betswish / MIRAGE
Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/
☆25 · Mar 10, 2025 · Updated last year
amazon-science / contrastive-controlled-mt
Code and data for the IWSLT 2022 shared task on Formality Control for SLT
☆22 · May 24, 2023 · Updated 3 years ago
ufal / factgenie
Lightweight self-hosted span annotation tool
☆44 · Apr 20, 2026 · Updated 3 months ago
AppraiseDev / OCELoT
Project OCELoT: an Open, Collaborative Evaluation Leaderboard of Translations
☆23 · Jul 11, 2026 · Updated last week
GChrysostomou / ood_faith
☆13 · Jul 26, 2023 · Updated 2 years ago
deep-spin / tower-eval
☆29 · Nov 14, 2025 · Updated 8 months ago
SapienzaNLP / maverick-coref
☆69 · Jun 10, 2025 · Updated last year
michelbl / MPPCA
Mixtures of Probabilistic Principal Component Analysers implementation in python
☆31 · Feb 6, 2018 · Updated 8 years ago
ltgoslo / definition_modeling
Interpretable Word Sense Representations via Definition Generation
☆10 · Mar 6, 2025 · Updated last year
SapienzaNLP / gsrl
GSRL is a seq2seq model for end-to-end dependency- and span-based SRL (IJCAI2021).
☆18 · Sep 14, 2021 · Updated 4 years ago
MadryLab / AT2
Attribute statements generated by LLMs to preceding tokens using attention weights.
☆28 · Apr 22, 2025 · Updated last year
TransluceAI / circuits
ADAG: Transluce's MLP neuron-level circuit tracing library
☆33 · Apr 10, 2026 · Updated 3 months ago
kernelmachine / demix-data
Benchmark API for Multidomain Language Modeling
☆25 · Aug 26, 2022 · Updated 3 years ago
arianhosseini / negation-learning
code for our paper "Understanding by Understanding Not: Modeling Negation in Language Models"
☆16 · Aug 15, 2022 · Updated 3 years ago
ZurichNLP / mbr
Minimum Bayes Risk Decoding for Hugging Face Transformers
☆60 · Jun 3, 2024 · Updated 2 years ago
seinan9 / LSCDiscovery
Scripts for large-scale prediction of lexical semantic change.
☆14 · Feb 9, 2023 · Updated 3 years ago