Tools and Modeling Code for the MASSIVE dataset
☆557 · Nov 28, 2022 · Updated 3 years ago
Alternatives and similar repositories for massive
Users who are interested in massive are comparing it to the repositories listed below.
- Repository for SLURP paper ☆109 · Apr 20, 2022 · Updated 3 years ago
- Data and code for the paper "End-to-End Slot Alignment and Recognition for Cross-Lingual NLU" (Accepted to EMNLP 2020) ☆27 · Jan 13, 2022 · Updated 4 years ago
- XTREME is a benchmark for the evaluation of the cross-lingual generalization ability of pre-trained multilingual models that covers 40 ty… ☆652 · Jan 4, 2023 · Updated 3 years ago
- TyDi QA contains 200k human-annotated question-answer pairs in 11 Typologically Diverse languages, written without seeing the answer and … ☆316 · May 28, 2020 · Updated 5 years ago
- The Schema-Guided Dialogue Dataset ☆600 · Aug 7, 2023 · Updated 2 years ago
- ☆97 · Aug 6, 2022 · Updated 3 years ago
- Dataset from the paper "Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering" (COLING 2022) ☆118 · Oct 25, 2022 · Updated 3 years ago
- Adversarial Natural Language Inference Benchmark ☆399 · May 12, 2022 · Updated 3 years ago
- ☆13 · Aug 23, 2024 · Updated last year
- ☆364 · Nov 15, 2024 · Updated last year
- PyTorch implementation of JointBERT: "BERT for Joint Intent Classification and Slot Filling" ☆740 · Jan 11, 2024 · Updated 2 years ago
- DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue ☆286 · Jul 6, 2023 · Updated 2 years ago
- Repo for external large-scale work ☆6,542 · Apr 27, 2024 · Updated last year
- Open source code and data for AAAI 2022 Oral Paper "Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding" ☆35 · May 26, 2024 · Updated last year
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer ☆54 · Nov 21, 2022 · Updated 3 years ago
- Please see the readme file as well as our 2019 EMNLP paper linked here --> ☆221 · Apr 24, 2024 · Updated last year
- Facebook AI Research Sequence-to-Sequence Toolkit written in Python. ☆32,200 · Sep 30, 2025 · Updated 6 months ago
- Code to reproduce experiments in the paper "Task-Oriented Dialogue as Dataflow Synthesis" (TACL 2020). ☆309 · Apr 30, 2024 · Updated last year
- A dataset containing human-human knowledge-grounded open-domain conversations. ☆669 · Aug 2, 2024 · Updated last year
- Zero-shot dialogue state tracking (DST) ☆83 · Nov 18, 2021 · Updated 4 years ago
- Large datasets for conversational AI ☆1,390 · Nov 16, 2019 · Updated 6 years ago
- Multilingual Compositional Wikidata Questions (MCWQ) ☆20 · Jun 12, 2023 · Updated 2 years ago
- Few-Shot-Intent-Detection includes popular challenging intent detection datasets with/without OOS queries and state-of-the-art baselines … ☆154 · Jul 19, 2023 · Updated 2 years ago
- Multi-angle c(q)uestion answering ☆458 · Aug 22, 2022 · Updated 3 years ago
- Repository that accompanies "An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction" (EMNLP 2019) ☆220 · Jun 1, 2021 · Updated 4 years ago
- Pre-Trained Models for ToD-BERT ☆294 · Jul 17, 2023 · Updated 2 years ago
- ☆133 · Jul 5, 2023 · Updated 2 years ago
- New dataset ☆311 · Aug 31, 2021 · Updated 4 years ago
- Overview of corpora/datasets for Germanic low-resource languages and dialects. Accompanies "A Survey of Corpora for Germanic Low-Resource… ☆27 · Feb 16, 2026 · Updated last month
- Efficient few-shot learning with Sentence Transformers ☆2,703 · Dec 11, 2025 · Updated 3 months ago
- The PIZZA dataset continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, who… ☆20 · Dec 7, 2022 · Updated 3 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆160 · Jun 18, 2024 · Updated last year
- ☆511 · Sep 23, 2020 · Updated 5 years ago
- Data collator for UL2 and U-PaLM ☆29 · Aug 20, 2023 · Updated 2 years ago
- Language Models as Few-Shot Learner for Task-Oriented Dialogue Systems ☆22 · May 28, 2021 · Updated 4 years ago
- The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic … ☆3,642 · Updated this week
- Official Implementation of "DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization." ☆143 · Nov 1, 2022 · Updated 3 years ago
- ☆75 · Jul 2, 2021 · Updated 4 years ago
- Data & Code for ACCENTOR: "Adding Chit-Chat to Enhance Task-Oriented Dialogues" (NAACL 2021) ☆72 · Oct 12, 2021 · Updated 4 years ago