microsoft / Lightweight-Low-Resource-NMT
Official code for "Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models" to appear in WMT 2022.
☆17 Updated last year
Alternatives and similar repositories for Lightweight-Low-Resource-NMT
Users who are interested in Lightweight-Low-Resource-NMT compare it to the repositories listed below.
- We release the UICaption dataset. The dataset consists of UI images (icons and screenshots) and associated text descriptions. This datase… ☆40 Updated 2 years ago
- CyBERTron-LM is a project which collects some pre-trained Transformer-based models. ☆12 Updated last year
- Fault-aware neural code rankers ☆28 Updated 2 years ago
- ☆22 Updated 2 years ago
- DeFacto - Demonstrations and Feedback for improving factual consistency of text summarization ☆29 Updated 2 years ago
- ☆84 Updated last year
- This repo contains data and code for the paper "Reasoning over Public and Private Data in Retrieval-Based Systems." ☆46 Updated 10 months ago
- ☆14 Updated last year
- A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset. ☆31 Updated 2 years ago
- UNISUMM: Unified Few-shot Summarization with Multi-Task Pre-Training and Prefix-Tuning ☆60 Updated last year
- IndicGenBench is a high-quality, multilingual, multi-way parallel benchmark for evaluating Large Language Models (LLMs) on 4 user-facing … ☆50 Updated 9 months ago
- ☆78 Updated last year
- Code used for the creation of OBELICS, an open, massive and curated collection of interleaved image-text web documents, containing 141M d… ☆202 Updated 9 months ago
- A diff tool for language models ☆42 Updated last year
- ☆97 Updated 2 years ago
- [NeurIPS 2022] code for "K-LITE: Learning Transferable Visual Models with External Knowledge" https://arxiv.org/abs/2204.09222 ☆51 Updated last year
- Pipeline for pulling and processing online language model pretraining data from the web ☆178 Updated last year
- We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically … ☆181 Updated 2 years ago
- This project studies the performance and robustness of language models and task-adaptation methods. ☆150 Updated last year
- ☆149 Updated last year
- We believe the ability of an LLM to attribute the text that it generates is likely to be crucial for both system developers and users in … ☆54 Updated last year
- This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Data… ☆26 Updated 2 years ago
- ☆65 Updated last year
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆69 Updated last year
- ☆32 Updated 2 years ago
- A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB te… ☆277 Updated 4 months ago
- Code for Zero-Shot Tokenizer Transfer ☆128 Updated 4 months ago
- ☆211 Updated 3 months ago
- code for paper "Accessing higher dimensions for unsupervised word translation" ☆21 Updated last year
- ☆11 Updated 4 years ago