microsoft / Lightweight-Low-Resource-NMT
Official code for "Too Brittle To Touch: Comparing the Stability of Quantization and Distillation Towards Developing Lightweight Low-Resource MT Models" to appear in WMT 2022.
☆17 · Updated last year
Alternatives and similar repositories for Lightweight-Low-Resource-NMT
Users who are interested in Lightweight-Low-Resource-NMT are comparing it to the libraries listed below.
- CyBERTron-LM is a project which collects some pre-trained Transformer-based models. ☆12 · Updated last year
- We release the UICaption dataset. The dataset consists of UI images (icons and screenshots) and associated text descriptions. This datase… ☆39 · Updated 2 years ago
- Fault-aware neural code rankers ☆28 · Updated 2 years ago
- UNISUMM: Unified Few-shot Summarization with Multi-Task Pre-Training and Prefix-Tuning ☆60 · Updated last year
- DeFacto - Demonstrations and Feedback for improving factual consistency of text summarization ☆29 · Updated 2 years ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Updated last year
- Repo for "Smart Word Suggestions" (SWS) task and benchmark ☆20 · Updated last year
- NTREX -- News Test References for MT Evaluation ☆83 · Updated 11 months ago
- A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset. ☆33 · Updated 2 years ago
- Code for the Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆68 · Updated last year
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any Hugging Face text dataset. ☆93 · Updated 2 years ago
- Tools for managing datasets for governance and training. ☆85 · Updated 3 months ago
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks. ☆108 · Updated 2 months ago
- Tutorial to pretrain & fine-tune a 🤗 Flax T5 model on a TPUv3-8 with GCP ☆58 · Updated 2 years ago
- [NAACL 2024] Official repository for "KTRL+F: Knowledge-Augmented In-Document Search" ☆23 · Updated 7 months ago
- Code for the paper "Accessing higher dimensions for unsupervised word translation" ☆21 · Updated last year
- This repository contains the dataset and the PyTorch implementations of the models from the paper CIDER: Commonsense Inference for Dialog… ☆27 · Updated 2 years ago
- We introduce MKQA, an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically … ☆178 · Updated 2 years ago
- This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Data… ☆26 · Updated 2 years ago
- This repo contains data and code for the paper "Reasoning over Public and Private Data in Retrieval-Based Systems." ☆46 · Updated 10 months ago
- Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://a… ☆46 · Updated 2 years ago