We use 14 datasets as out-of-distribution (OOD) test data and evaluate 21 widely used models on 8 NLU tasks. Our findings confirm that OOD accuracy in NLP tasks deserves more attention, since a significant performance drop relative to in-distribution (ID) accuracy was observed in all settings.
☆93 Aug 15, 2023 Updated 2 years ago
Alternatives and similar repositories for GLUE-X
Users who are interested in GLUE-X are comparing it to the libraries listed below
- ☆68 May 16, 2023 Updated 2 years ago
- ASK-Attack and ASK-Defense ☆42 Oct 12, 2022 Updated 3 years ago
- Convert JSON to TypeScript interfaces ☆221 Jan 5, 2023 Updated 3 years ago
- Deep Reinforcement Learning Algorithms for solving Atari 2600 Games ☆143 Mar 23, 2023 Updated 2 years ago