Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
☆11 · Apr 6, 2025 · Updated last year
Alternatives and similar repositories for MEXA
Users that are interested in MEXA are comparing it to the libraries listed below.
- GlotWeb: Web Indexing for Minority Languages (WWW 2026) ☆17 · Feb 27, 2026 · Updated last month
- GlotCC Dataset and Pipeline -- NeurIPS 2024 ☆20 · Apr 6, 2025 · Updated last year
- A framework that aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining ☆18 · Nov 26, 2023 · Updated 2 years ago
- ☆15 · Mar 8, 2024 · Updated 2 years ago
- Can LLMs generate code-mixed sentences through zero-shot prompting? ☆11 · Apr 18, 2023 · Updated 2 years ago
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks ☆12 · Nov 9, 2021 · Updated 4 years ago
- Resource and Tool for Writing System Identification (Unicode 17.0) -- LREC 2024 ☆21 · Mar 29, 2026 · Updated 2 weeks ago
- Curriculum training ☆22 · Jun 25, 2025 · Updated 9 months ago
- GlotEval: a unified evaluation toolkit designed to benchmark multilingual Large Language Models (LLMs) in a language-specific way ☆18 · Nov 4, 2025 · Updated 5 months ago
- The geometry of multilingual language model representations (EMNLP 2022). ☆22 · Oct 21, 2022 · Updated 3 years ago
- PyTorch source code of NAACL 2021 paper "Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Tran… ☆18 · Oct 18, 2022 · Updated 3 years ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆106 · Apr 20, 2024 · Updated last year
- A Python library for easily querying morphological inflection models trained on UniMorph ☆13 · Oct 23, 2022 · Updated 3 years ago
- SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects ☆23 · Jan 26, 2025 · Updated last year
- Easy-to-use framework for evaluating cross-lingual consistency of factual knowledge (Supported LLaMA, BLOOM, mT5, RoBERTa, etc.) Paper he… ☆28 · Aug 8, 2025 · Updated 8 months ago
- Code for the paper "Greed is All You Need: An Evaluation of Tokenizer Inference Methods" ☆13 · Nov 26, 2024 · Updated last year
- SCT: An Efficient Self-Supervised Cross-View Training For Sentence Embedding (TACL) ☆16 · Jul 27, 2024 · Updated last year
- Minimal code to train ELMo models in recent versions of TensorFlow ☆14 · Apr 30, 2023 · Updated 2 years ago
- Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations" ☆13 · Dec 14, 2021 · Updated 4 years ago
- Code base for the EMNLP 2021 Findings paper: Cartography Active Learning ☆14 · Jun 3, 2025 · Updated 10 months ago
- An extension of the Transformers library to include a T5ForSequenceClassification class. ☆40 · Apr 17, 2023 · Updated 2 years ago
- ☆21 · Dec 30, 2022 · Updated 3 years ago
- Repository of PIXAR, a Pixel-based Auto-Regressive Language Model ☆18 · Sep 15, 2025 · Updated 6 months ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 · Aug 15, 2023 · Updated 2 years ago
- [NeurIPS 2023] Code base for the Renyi Kernel Entropy (RKE) metric for generative models. ☆13 · Jun 18, 2025 · Updated 9 months ago
- Overview of corpora/datasets for Germanic low-resource languages and dialects. Accompanies "A Survey of Corpora for Germanic Low-Resource… ☆27 · Feb 16, 2026 · Updated last month
- Code and data for the paper "Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?" ☆26 · Jun 3, 2025 · Updated 10 months ago
- Official code release for "SuperBPE: Space Travel for Language Models" ☆90 · Jan 9, 2026 · Updated 3 months ago
- Code for paper "Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability" ☆15 · Jun 13, 2023 · Updated 2 years ago
- Match your fig size and font to conference formats. ☆11 · Aug 16, 2021 · Updated 4 years ago
- Python source code for EMNLP 2020 paper "Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT". ☆35 · Mar 16, 2022 · Updated 4 years ago
- ☆14 · Jan 4, 2021 · Updated 5 years ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆29 · Apr 17, 2024 · Updated last year
- ☆44 · Feb 11, 2026 · Updated 2 months ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning ☆30 · Jan 25, 2023 · Updated 3 years ago
- ☆10 · Oct 17, 2021 · Updated 4 years ago
- EEG-MI signal classification DL model. ☆14 · Apr 26, 2024 · Updated last year
- Repository for sharing the data in the Tamasheq language, one of the target languages for the low-resource speech translation track at IW… ☆18 · Nov 30, 2022 · Updated 3 years ago
- ☆37 · Nov 14, 2025 · Updated 4 months ago