guijinSON/MM-Eval

Official implementation for "MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models"

☆20

Alternatives and similar repositories for MM-Eval

Users that are interested in MM-Eval are comparing it to the libraries listed below.

Cohere-Labs-Community / m-rewardbench
Evaluating Reward Models in Multilingual Settings (ACL Main '25)
☆42 · May 16, 2025 · Updated last year
joeljang / FLM
All-in-one repository for Fine-tuning & Pretraining (Large) Language Models
☆15 · Mar 8, 2023 · Updated 3 years ago
daniel-furman / polyglot-or-not
Are foundation LMs multilingual knowledge bases? (EMNLP 2023)
☆18 · Dec 8, 2023 · Updated 2 years ago
kaistAI / KtrlF
[NAACL 2024] Official repository for "KTRL+F: Knowledge-Augmented In-Document Search"
☆23 · Oct 11, 2024 · Updated last year
violet-zct / fairseq-dro-mnmt
☆14 · Sep 10, 2021 · Updated 4 years ago
kaistAI / GAP
[ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization
☆29 · Sep 12, 2024 · Updated last year
EleutherAI / hae-rae
☆33 · Aug 30, 2023 · Updated 2 years ago
jkc-ai / mwp_kr_data
☆13 · Jan 12, 2023 · Updated 3 years ago
eXascaleInfolab / MARTA
☆18 · Feb 22, 2023 · Updated 3 years ago
Xinyi2016 / FInstruct
☆14 · May 8, 2023 · Updated 3 years ago
prometheus-eval / cmu-paper-reviewer
Code repository for the "CMU Paper Reviewer System", an agentic system that generates reviews for academic papers.
☆25 · Jun 9, 2026 · Updated last month
The-FinAI / The-FinData
the benchmark for finance
☆11 · Jul 4, 2023 · Updated 3 years ago
MiuLab / FactAlign
Source code of our EMNLP 2024 paper "FactAlign: Long-form Factuality Alignment of Large Language Models"
☆19 · Oct 3, 2024 · Updated last year
jkc-ai / mwp-korean-data-2021
Public repository for the NLP-based [Korean math word problem dataset].
☆14 · Jun 12, 2023 · Updated 3 years ago
floatai / HumanEval-XL
[LREC-COLING'24] HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization
☆42 · Mar 7, 2025 · Updated last year
kimyuji / EvolvingQA_benchmark
Code and Dataset release of "Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models" (NAACL 2024)
☆10 · Oct 16, 2024 · Updated last year
TIGER-AI-Lab / MAmmoTH2
Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024]
☆146 · Oct 27, 2024 · Updated last year
wudapeng268 / KBQA-Baseline
☆11 · Sep 27, 2022 · Updated 3 years ago
convei-lab / BotsTalk
🤖 Code for our EMNLP 2022 paper: "BotsTalk: Machine-sourced Framework for Automatic Curation of Large-scale Multi-skill Dialogue Dataset…
☆16 · Oct 7, 2024 · Updated last year
soyoung97 / ListT5
official repository for ListT5
☆51 · Nov 27, 2025 · Updated 8 months ago
zeyuyun1 / TransformerVis
☆43 · Nov 16, 2021 · Updated 4 years ago
rbawden / mt-bigscience
Evaluation results for Machine Translation within the BigScience project
☆11 · May 15, 2023 · Updated 3 years ago
aviaefrat / lmentry
☆15 · Nov 22, 2023 · Updated 2 years ago
soheeyang / unified-prompt-selection
[TACL 2024] Improving Probability-based Prompt Selection Through Unified Evaluation and Analysis
☆11 · Nov 14, 2024 · Updated last year
facebookresearch / evaluation-of-nmt-bt
This repository contains additional reference translations for the WMT'14 En-De (newstest2014) and WMT'19 En-Ru (newstest2019) test sets …
☆15 · Aug 31, 2021 · Updated 4 years ago
NahianHasan / Cardiovascular_Disease_Classification_Employing_EMD
Cardiovascular Disease Classification Employing Empirical Mode Decomposition (EMD) of Modified ECG
☆12 · Oct 6, 2023 · Updated 2 years ago
kaistAI / How-Well-Do-LLMs-Truly-Ground
☆11 · Sep 19, 2025 · Updated 10 months ago
cimm-kzn / RuDReC
Russian Drug Reaction Corpus (RuDReC)
☆13 · Dec 29, 2020 · Updated 5 years ago
Leukas / CUTE
☆20 · Apr 26, 2026 · Updated 3 months ago
jwieting / bilingual-generative-transformer
Code for "A Bilingual Generative Transformer for Semantic Sentence Embedding" published at EMNLP 2020.
☆10 · Nov 20, 2020 · Updated 5 years ago
helliun / causal-chains
Library for creating causal chains using language models.
☆82 · Feb 9, 2023 · Updated 3 years ago
GaiYu0 / QDGAT
Question-Directed Graph Attention Network for Numerical Reasoning over Text
☆10 · Aug 14, 2020 · Updated 5 years ago
ari-holtzman / newformer
☆16 · Jul 20, 2023 · Updated 3 years ago
PythonNut / superbpe
Official code release for "SuperBPE: Space Travel for Language Models"
☆97 · May 28, 2026 · Updated last month
meta-metrics / metametrics
Accepted to ICLR 2025. MetaMetrics is a calibrated meta-metric designed to evaluate generation tasks across different modalities aligned …
☆15 · Dec 30, 2024 · Updated last year
shadowkiller33 / Language_attack
A repo for LLM jailbreak
☆14 · Sep 5, 2023 · Updated 2 years ago
BenedictusAryo / OpenVino_face-detection_python
Tutorial of Face Detection using OpenVino python
☆11 · Nov 23, 2020 · Updated 5 years ago
zhengzx-nlp / past-and-future-nmt
Implementation of "Modeling Past and Future for Neural Machine Translation"
☆15 · Mar 16, 2018 · Updated 8 years ago
monoclear-ai / monoclear.ai
Korean LLM leaderboard and model performance/safety management
☆22 · Sep 26, 2023 · Updated 2 years ago