Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
☆11 · Apr 6, 2025 · Updated 11 months ago
Alternatives and similar repositories for MEXA
Users that are interested in MEXA are comparing it to the libraries listed below.
- GlotWeb: Web Indexing for Minority Languages (WWW 2026) ☆17 · Feb 27, 2026 · Updated 3 weeks ago
- GlotCC Dataset and Pipeline -- NeurIPS 2024 ☆20 · Apr 6, 2025 · Updated 11 months ago
- A Framework aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining ☆18 · Nov 26, 2023 · Updated 2 years ago
- ☆15 · Mar 8, 2024 · Updated 2 years ago
- Can LLMs generate code-mixed sentences through zero-shot prompting? ☆11 · Apr 18, 2023 · Updated 2 years ago
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks ☆12 · Nov 9, 2021 · Updated 4 years ago
- Resource and Tool for Writing System Identification (Unicode 17.0) -- LREC 2024 ☆21 · Feb 17, 2026 · Updated last month
- Curriculum training ☆22 · Jun 25, 2025 · Updated 8 months ago
- GlotEval: a unified evaluation toolkit designed to benchmark multilingual Large Language Models (LLMs) in a language-specific way ☆18 · Nov 4, 2025 · Updated 4 months ago
- The geometry of multilingual language model representations (EMNLP 2022). ☆22 · Oct 21, 2022 · Updated 3 years ago
- PyTorch source code of NAACL 2021 paper "Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Tran… ☆18 · Oct 18, 2022 · Updated 3 years ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆106 · Apr 20, 2024 · Updated last year
- A python library for easily querying morphological inflection models trained on Unimorph ☆13 · Oct 23, 2022 · Updated 3 years ago
- SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects ☆23 · Jan 26, 2025 · Updated last year
- Easy-to-use framework for evaluating cross-lingual consistency of factual knowledge (Supported LLaMA, BLOOM, mT5, RoBERTa, etc.) Paper he… ☆27 · Aug 8, 2025 · Updated 7 months ago
- Code for the paper "Greed is All You Need: An Evaluation of Tokenizer Inference Methods" ☆13 · Nov 26, 2024 · Updated last year
- SCT: An Efficient Self-Supervised Cross-View Training For Sentence Embedding (TACL) ☆16 · Jul 27, 2024 · Updated last year
- Minimal code to train ELMo models in recent versions of TensorFlow ☆14 · Apr 30, 2023 · Updated 2 years ago
- Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations" ☆13 · Dec 14, 2021 · Updated 4 years ago
- Code base for the EMNLP 2021 Findings paper: Cartography Active Learning ☆14 · Jun 3, 2025 · Updated 9 months ago
- An extension of the Transformers library to include a T5ForSequenceClassification class. ☆40 · Apr 17, 2023 · Updated 2 years ago
- ☆21 · Dec 30, 2022 · Updated 3 years ago
- Repository of PIXAR, a Pixel-based Auto-Regressive Language Model ☆18 · Sep 15, 2025 · Updated 6 months ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆35 · Aug 15, 2023 · Updated 2 years ago
- [NeurIPS 2023] Code base for the Renyi Kernel Entropy (RKE) metric for generative models. ☆13 · Jun 18, 2025 · Updated 9 months ago
- Overview of corpora/datasets for Germanic low-resource languages and dialects. Accompanies "A Survey of Corpora for Germanic Low-Resource… ☆27 · Feb 16, 2026 · Updated last month
- Code and data for the paper "Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?" ☆26 · Jun 3, 2025 · Updated 9 months ago
- Official code release for "SuperBPE: Space Travel for Language Models" ☆90 · Jan 9, 2026 · Updated 2 months ago
- Code for paper "Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability" ☆15 · Jun 13, 2023 · Updated 2 years ago
- Match your fig size and font to conference formats. ☆11 · Aug 16, 2021 · Updated 4 years ago
- ☆14 · Jan 4, 2021 · Updated 5 years ago
- Python source code for EMNLP 2020 paper "Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT". ☆35 · Mar 16, 2022 · Updated 4 years ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆28 · Apr 17, 2024 · Updated last year
- ☆44 · Feb 11, 2026 · Updated last month
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning ☆30 · Jan 25, 2023 · Updated 3 years ago
- ☆10 · Oct 17, 2021 · Updated 4 years ago
- EEG-MI signal classification DL model. ☆14 · Apr 26, 2024 · Updated last year
- Repository for sharing the data in the Tamasheq language, one of the target languages for the low-resource speech translation track at IW… ☆18 · Nov 30, 2022 · Updated 3 years ago
- ☆37 · Nov 14, 2025 · Updated 4 months ago