google-research-datasets / Hinglish-TOP-Dataset

Contains the largest (10K) human-annotated code-switched semantic parsing dataset and 170K utterances generated using the CST5 augmentation technique. Queries are derived from TOPv2, a multi-domain task-oriented semantic parsing dataset. Tests suggest that with CST5, up to 20x less labeled data is needed to achieve the same semantic parsing performance.

Alternatives and similar repositories for Hinglish-TOP-Dataset:

Users who are interested in Hinglish-TOP-Dataset compare it to the repositories listed below.