google-research-datasets / Hinglish-TOP-Dataset

Contains the largest (10K) human-annotated code-switched semantic parsing dataset and 170K utterances generated with the CST5 augmentation technique. Queries are derived from TOPv2, a multi-domain task-oriented semantic parsing dataset. Tests suggest that with CST5, up to 20x less labeled data can achieve the same semantic parsing performance.
☆37 Updated 2 years ago

Alternatives and similar repositories for Hinglish-TOP-Dataset:

Users who are interested in Hinglish-TOP-Dataset are comparing it to the repositories listed below