microsoft / Lightweight-Low-Resource-NMT
Official code for "Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models" to appear in WMT 2022.
☆17 · Updated last year
Related projects
Alternatives and complementary repositories for Lightweight-Low-Resource-NMT
- CyBERTron-LM is a project which collects some pre-trained Transformer-based models. ☆12 · Updated last year
- We release the UICaption dataset. The dataset consists of UI images (icons and screenshots) and associated text descriptions. This datase… ☆34 · Updated last year
- ☆84 · Updated last year
- DeFacto - Demonstrations and Feedback for improving factual consistency of text summarization ☆27 · Updated last year
- Fault-aware neural code rankers ☆25 · Updated last year
- ☆14 · Updated last year
- UNISUMM: Unified Few-shot Summarization with Multi-Task Pre-Training and Prefix-Tuning ☆60 · Updated last year
- A self-supervised learning approach based on extremely large masking ☆29 · Updated last year
- Experiments for "Automatic Calibration and Error Correction for Large Language Models via Pareto Optimal Self-Supervision" ☆13 · Updated last year
- ☆22 · Updated last year
- [NeurIPS 2022] code for "K-LITE: Learning Transferable Visual Models with External Knowledge" https://arxiv.org/abs/2204.09222 ☆51 · Updated last year
- Diffusion-based markup-to-image generation ☆78 · Updated last year
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Updated last year
- Code for Zero-Shot Tokenizer Transfer ☆115 · Updated 2 weeks ago
- Gallery for Industry AI demos ☆17 · Updated last year
- Pipeline for pulling and processing online language model pretraining data from the web ☆174 · Updated last year
- The official repo of our research work "Interactive Editing for Text Summarization". ☆22 · Updated last year
- LL3M: Large Language and Multi-Modal Model in Jax ☆64 · Updated 6 months ago
- Efficiently computing & storing token n-grams from large corpora ☆15 · Updated last month
- Python tools for processing the stackexchange data dumps into a text dataset for Language Models ☆76 · Updated 11 months ago
- ☆26 · Updated last year
- Code used for the creation of OBELICS, an open, massive and curated collection of interleaved image-text web documents, containing 141M d… ☆186 · Updated 2 months ago
- ☆149 · Updated 10 months ago
- ☆20 · Updated last year
- M4 experiment logbook ☆56 · Updated last year
- Code and data from the paper 'Human Feedback is not Gold Standard' ☆18 · Updated 4 months ago
- RL algorithm: Advantage induced policy alignment ☆62 · Updated last year
- Code for the paper "Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm" (https://arxiv.org/abs/2007.14966). ☆57 · Updated 2 years ago
- Multimodal language model benchmark, featuring challenging examples ☆148 · Updated 2 months ago