microsoft / Lightweight-Low-Resource-NMT
Official code for "Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models" to appear in WMT 2022.
☆17 · Updated last year
Alternatives and similar repositories for Lightweight-Low-Resource-NMT:
Users who are interested in Lightweight-Low-Resource-NMT are comparing it to the repositories listed below.
- CyBERTron-LM is a project which collects some pre-trained Transformer-based models. ☆12 · Updated last year
- We release the UICaption dataset. The dataset consists of UI images (icons and screenshots) and associated text descriptions. This datase… ☆37 · Updated 2 years ago
- Fault-aware neural code rankers ☆27 · Updated 2 years ago
- UNISUMM: Unified Few-shot Summarization with Multi-Task Pre-Training and Prefix-Tuning ☆60 · Updated last year
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Updated last year
- DeFacto - Demonstrations and Feedback for improving factual consistency of text summarization ☆27 · Updated 2 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆93 · Updated last year
- Repo for "Smart Word Suggestions" (SWS) task and benchmark ☆20 · Updated last year
- **ARCHIVED** Filesystem interface to 🤗 Hub ☆57 · Updated last year
- Experiments for "Automatic Calibration and Error Correction for Large Language Models via Pareto Optimal Self-Supervision" ☆13 · Updated last year
- The official repo of our research work "Interactive Editing for Text Summarization". ☆22 · Updated last year
- Finite-state script normalization and processing utilities ☆38 · Updated this week
- Code for Zero-Shot Tokenizer Transfer ☆119 · Updated this week
- Implementation of "SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages" paper, accepted to E… ☆20 · Updated 2 years ago
- Truly flash T5 realization! ☆60 · Updated 7 months ago
- Evaluation pipeline for the BabyLM Challenge 2023. ☆73 · Updated last year
- A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset. ☆32 · Updated last year
- Python tools for processing the stackexchange data dumps into a text dataset for Language Models ☆80 · Updated last year
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆67 · Updated 10 months ago
- Self-Conditioning Pre-Trained Language Models, ICML 2022 ☆30 · Updated 2 years ago
- This tool helps automatic generation of grammatically valid synthetic Code-mixed data by utilizing linguistic theories such as Equivalenc… ☆53 · Updated 5 months ago