fdschmidt93 / trident-nllb-llm2vec
Repository for "Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages"
☆13 Updated last month
Related projects
Alternatives and complementary repositories for trident-nllb-llm2vec
- GlotCC Dataset and Pipeline -- NeurIPS 2024 ☆16 Updated 3 weeks ago
- Official implementation of "GPT or BERT: why not both?" ☆36 Updated last week
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆23 Updated 7 months ago
- Implementation of "SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages" paper, accepted to E… ☆19 Updated 2 years ago
- Vocabulary Trimming (VT) is a model compression technique, which reduces a multilingual LM vocabulary to a target language by deleting ir… ☆31 Updated 3 weeks ago
- A tiny BERT for low-resource monolingual models ☆29 Updated last month
- Library for pruning experts per language pair in NLLB-200 ☆30 Updated last year
- LTG-Bert ☆29 Updated 10 months ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning ☆29 Updated last year
- BigKnow2022: Bringing Language Models Up to Speed ☆14 Updated last year
- Minimum Bayes Risk Decoding for Hugging Face Transformers ☆56 Updated 5 months ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages ☆13 Updated 2 years ago
- ☆19 Updated last year
- MAMMOTH: MAssively Multilingual Modular Open Translation @ Helsinki ☆22 Updated this week
- Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation" ☆13 Updated 9 months ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 Updated 2 years ago
- ☆20 Updated last year
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp… ☆12 Updated last year
- A library for data streaming and augmentation ☆20 Updated 8 months ago
- Can LLMs generate code-mixed sentences through zero-shot prompting? ☆11 Updated last year
- Suite for phonetic word embeddings, especially their evaluation and baseline models. ☆24 Updated 3 weeks ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆75 Updated 2 months ago
- Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Lo… ☆39 Updated 10 months ago
- ☆15 Updated last year
- ☆32 Updated last year
- ☆14 Updated last month
- The official repository for Toxic Commons and Celadon. Toxicity Classification for public domain data. ☆9 Updated last week
- Code for the paper "Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm" (https://arxiv.org/abs/2007.14966). ☆57 Updated 2 years ago
- ☆16 Updated last year
- Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs" ☆28 Updated 2 years ago