lukemelas / mtob
☆28Updated 7 months ago
Alternatives and similar repositories for mtob:
Users that are interested in mtob are comparing it to the libraries listed below
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation"☆30Updated 2 years ago
- The implementation of "Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Deco…☆32Updated last year
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023.☆40Updated last year
- Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding☆18Updated 2 years ago
- ☆13Updated 7 months ago
- ☆81Updated last year
- [NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective☆30Updated last year
- The geometry of multilingual language model representations (EMNLP 2022).☆16Updated 2 years ago
- ☆20Updated 4 years ago
- ☆23Updated last month
- NAACL 2024: SeaEval for Multilingual Foundation Models: From Cross-Lingual Alignment to Cultural Reasoning☆24Updated last month
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023☆99Updated 9 months ago
- ☆21Updated 2 years ago
- ☆58Updated 2 years ago
- Code for the paper "Simulating Bandit Learning from User Feedback for Extractive Question Answering".☆18Updated 2 years ago
- Materials for "Quantifying the Plausibility of Context Reliance in Neural Machine Translation" at ICLR'24 🐑 🐑☆14Updated 9 months ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language☆70Updated 10 months ago
- ☆25Updated 2 years ago
- Code repository for the paper "Mission: Impossible Language Models."☆42Updated 2 weeks ago
- Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021)☆24Updated 3 years ago
- EMNLP 2022: Finding Dataset Shortcuts with Grammar Induction https://arxiv.org/abs/2210.11560☆58Updated last year
- ☆48Updated last year
- DEMix Layers for Modular Language Modeling☆53Updated 3 years ago
- Easy-to-use framework for evaluating cross-lingual consistency of factual knowledge (Supported LLaMA, BLOOM, mT5, RoBERTa, etc.) Paper he…☆22Updated 2 months ago
- M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer☆55Updated 2 years ago
- [ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training☆19Updated 5 months ago
- The LM Contamination Index is a manually created database of contamination evidences for LMs.☆77Updated 9 months ago
- EMNLP2022 "Cross-Align: Modeling Deep Cross-lingual Interactions for Word Alignment"☆17Updated last year
- ☆11Updated 2 years ago
- Introduction and scripts for ACL-2020 paper "On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation"☆21Updated 4 years ago