IamAdiSri / hf-trim
Reduce the size of pretrained Hugging Face models via vocabulary trimming.
☆44 Updated 2 years ago
Alternatives and similar repositories for hf-trim
Users that are interested in hf-trim are comparing it to the libraries listed below
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT ☆21 Updated last year
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks. ☆75 Updated last year
- NTREX -- News Test References for MT Evaluation ☆83 Updated 11 months ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆100 Updated last year
- Code for our paper "Mask-Align: Self-Supervised Neural Word Alignment" in ACL 2021 ☆61 Updated 4 years ago
- ☆92 Updated last year
- Official Implementation of "DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization." ☆139 Updated 2 years ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆30 Updated 3 years ago
- The implementation of "Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Deco… ☆34 Updated last year
- This repo supports various cross-lingual transfer learning & multilingual NLP models. ☆92 Updated last year
- A repository with the code related to experiments around context-aware machine translation ☆50 Updated 2 years ago
- Tools for formatting WMT hypothesis and test sets in XML ☆27 Updated 3 weeks ago
- Code and data accompanying our ACL 2020 paper, "Unsupervised Domain Clusters in Pretrained Language Models". ☆58 Updated 4 years ago
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks. ☆107 Updated last month
- Repository containing the open source code of works published at the FBK MT unit. ☆43 Updated 3 weeks ago
- MT Evaluation in Many Languages via Zero-Shot Paraphrasing ☆101 Updated 9 months ago
- An official implementation of "BPE-Dropout: Simple and Effective Subword Regularization" algorithm. ☆51 Updated 4 years ago
- The repository for the paper: Multilingual Translation via Grafting Pre-trained Language Models ☆24 Updated 3 years ago
- A library of translation-based text similarity measures ☆25 Updated last year
- Materials for "Natural Language Processing for Multilingual Task-Oriented Dialogue" Tutorial at ACL 2022 ☆14 Updated 2 years ago
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆31 Updated 2 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆93 Updated 2 years ago
- ☆36 Updated 2 years ago
- A tool for calculating character n-gram F score ☆72 Updated 2 years ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language ☆71 Updated last year
- m4Adapter: Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter (Findings of EMNLP 2022) ☆18 Updated 2 years ago
- How to finetune mbart using fairseq ☆24 Updated 4 years ago
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models. ☆80 Updated 8 months ago
- Fairseq tutorial ☆17 Updated 2 years ago
- MediaSum: A Large-scale Media Interview Dataset for Dialogue Summarization ☆73 Updated 3 years ago