ejmejm / multilingual-nmt-mt5
An example of multilingual machine translation using a pretrained version of mT5 from Hugging Face.
☆42 · Updated 3 years ago
Alternatives and similar repositories for multilingual-nmt-mt5:
Users who are interested in multilingual-nmt-mt5 are comparing it to the repositories listed below.
- MAFAND-MT ☆55 · Updated 8 months ago
- Abstractive and Extractive Text summarization using Transformers. ☆82 · Updated last year
- A collection of preprocessed datasets and pretrained models for generating paraphrases. ☆29 · Updated 3 years ago
- Common crawl pretrained sentencepiece tokenizers for English and Japanese for various vocabulary sizes. Also development environment for … ☆10 · Updated 3 years ago
- Crosslingual Question Answering for African Languages ☆29 · Updated 6 months ago
- Comparing M2M and mT5 on rare language pairs, blog post: https://medium.com/@abdessalemboukil/comparing-facebooks-m2m-to-mt5-in-low-re… ☆15 · Updated 3 years ago
- ☆109 · Updated last year
- A Multilingual Replicable Instruction-Following Model ☆93 · Updated last year
- Consists of the largest (10K) human annotated code-switched semantic parsing dataset & 170K generated utterances using the CST5 augmentati… ☆37 · Updated 2 years ago
- Hinglish Text Classification ☆30 · Updated last year
- cLang-8 is a dataset for grammatical error correction. ☆103 · Updated 2 years ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆71 · Updated last year
- An instruction-based benchmark for text improvements. ☆141 · Updated 2 years ago
- A python package to augment text data using NLP. ☆40 · Updated last month
- Pre-trained, multilingual sequence-to-sequence models for Indian languages ☆46 · Updated 2 years ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 · Updated 2 years ago
- Text Summarization for Research Papers ☆74 · Updated 2 years ago
- A web application that interfaces two GEC systems. [web instance is down] ☆31 · Updated 7 months ago
- Official implementation of the paper "CoEdIT: Text Editing by Task-Specific Instruction Tuning" (EMNLP 2023) ☆118 · Updated 6 months ago
- A pipeline for transliteration, spell correction, POS tagging and word sense disambiguation of Hinglish code mixed data to Hindi Devanaga… ☆35 · Updated last year
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… ☆87 · Updated last year
- A crowdsourced dataset of dialogues grounded in social contexts involving utilization of commonsense. ☆78 · Updated 3 years ago
- Question-answer generation from text ☆71 · Updated last year
- Some notebooks for NLP ☆198 · Updated last year
- A repository of example implementations for interesting ml concepts ☆28 · Updated 2 years ago
- A multi-purpose toolkit for table-to-text generation: web interface, Python bindings, CLI commands. ☆55 · Updated 10 months ago
- AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages ☆72 · Updated 2 years ago
- ☆32 · Updated 2 years ago
- This dataset contains synthetic training data for grammatical error correction. The corpus is generated by corrupting clean sentences fro… ☆160 · Updated 6 months ago
- Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0. ☆102 · Updated 2 years ago