tylerachang / multilingual-geometry
The geometry of multilingual language model representations (EMNLP 2022).
☆15 Updated 2 years ago
Related projects
Alternatives and complementary repositories for multilingual-geometry
- Easy-to-use framework for evaluating cross-lingual consistency of factual knowledge (Supported LLaMA, BLOOM, mT5, RoBERTa, etc.) Paper he… ☆21 Updated this week
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆31 Updated 2 years ago
- Data and code accompanying the paper "As Little as Possible, as Much as Necessary: Detecting Over- and Undertranslations with Contrastive… ☆21 Updated last year
- PyTorch source code of NAACL 2021 paper "Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Tran… ☆17 Updated 2 years ago
- Code and data accompanying our ACL 2020 paper, "Unsupervised Domain Clusters in Pretrained Language Models". ☆59 Updated 4 years ago
- ☆42 Updated last year
- ☆25 Updated 2 years ago
- FRANK: Factuality Evaluation Benchmark ☆52 Updated last year
- Faithfulness and factuality annotations of XSum summaries from our paper "On Faithfulness and Factuality in Abstractive Summarization" (h… ☆81 Updated 3 years ago
- UDapter is a multilingual dependency parser that uses "contextual" adapters together with language-typology features for language-specifi… ☆30 Updated last year
- Multilingual Dialogue Datasets ☆18 Updated 2 years ago
- ☆48 Updated last year
- ☆21 Updated 2 years ago
- ☆14 Updated 3 years ago
- ☆57 Updated 2 years ago
- Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks. ☆90 Updated last month
- This is the code for neural-Jacana aligner, and the data for MultiMWA dataset. ☆20 Updated last year
- A repository with the code related to experiments around context-aware machine translation ☆48 Updated 2 years ago
- Detect hallucinated tokens for conditional sequence generation. ☆63 Updated 2 years ago
- ☆25 Updated 2 years ago
- ☆11 Updated 2 years ago
- ☆22 Updated 7 months ago
- ☆44 Updated 3 years ago
- ☆17 Updated 9 months ago
- Codebase, data and models for the SummaC paper in TACL ☆85 Updated 10 months ago
- ☆20 Updated 3 years ago
- Pretraining scripts for BART transformer model ☆11 Updated last year
- This repository contains the dataset and code for "WiCE: Real-World Entailment for Claims in Wikipedia" in EMNLP 2023. ☆39 Updated 10 months ago
- Python source code for EMNLP 2021 Findings paper: "Subword Mapping and Anchoring Across Languages". ☆13 Updated 3 years ago
- ☆50 Updated 2 years ago