AI4Bharat / aacl23-mnmt-tutorial
Additional resources from our AACL tutorial
☆10Updated last year
Alternatives and similar repositories for aacl23-mnmt-tutorial:
Users who are interested in aacl23-mnmt-tutorial are comparing it to the repositories listed below
- Statistics on multilingual datasets☆17Updated 2 years ago
- CodemixedNLP: An Extensible and Open NLP Toolkit for Code-Switching☆18Updated 3 years ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language☆70Updated 11 months ago
- This repository hosts the experiments for the project I did with OffNote Labs.☆10Updated 3 years ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation"☆30Updated 2 years ago
- Repository for the English-Hindi Codemixed to Monolingual English Parallel Corpus☆13Updated 6 years ago
- This repository contains the HiNER dataset released with our paper at LREC 2022☆14Updated last year
- A tiny BERT for low-resource monolingual models☆31Updated 4 months ago
- This repository contains the code for the Form-Context Model and its Attentive Mimicking variant.☆31Updated 4 years ago
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks.☆72Updated last year
- A repository containing the details of a natural language inference dataset in Hindi☆11Updated 4 years ago
- LTG-Bert☆29Updated last year
- Codebase for probing and visualizing multilingual models.☆47Updated 4 years ago
- NTREX -- News Test References for MT Evaluation☆81Updated 8 months ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages☆13Updated 2 years ago
- A benchmark for code-switched NLP, ACL 2020☆74Updated 8 months ago
- Code and Data for Evaluation WG☆41Updated 2 years ago
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT☆21Updated last year
- Code Repository for the IndicXNLI paper.☆14Updated last year
- Pre-trained, multilingual sequence-to-sequence models for Indian languages☆45Updated 2 years ago
- CodeSwitch is an NLP tool that can be used for language identification, POS tagging, named entity recognition, and sentiment analysis of code mixed dat…☆33Updated 4 years ago
- Code of NAACL 2022 "Efficient Hierarchical Domain Adaptation for Pretrained Language Models" paper.☆32Updated last year
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.☆77Updated 5 months ago
- This code provides a word-level language identification tool for identifying the language of individual words in Code-Mixed text. e.g. The tex…☆51Updated 4 years ago
- Interactive Neural Machine Translation tool☆53Updated last year
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer☆55Updated 2 years ago
- Language-agnostic BERT Sentence Embedding (LaBSE) Pytorch Model☆21Updated 4 years ago