cl-tohoku / PheMT

A phenomenon-wise evaluation dataset for Japanese-English machine translation robustness. The dataset is based on the MTNT dataset, with additional annotations of four linguistic phenomena: Proper Noun, Abbreviated Noun, Colloquial Expression, and Variant. COLING 2020.

Related projects

Alternatives and complementary repositories for PheMT