adapter-hub/hgiyt

Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models"

☆28

Alternatives and similar repositories for hgiyt

Users who are interested in hgiyt are comparing it to the libraries listed below.

spyysalo / wiki-bert-pipeline
Generate BERT vocabularies and pretraining examples from Wikipedias
☆17 · May 11, 2020 · Updated 6 years ago
nateraw / spaces-docker-templates
🚀🤗 A collection of templates for Hugging Face Spaces
☆35 · Oct 9, 2023 · Updated 2 years ago
juditacs / snippets
Python snippets
☆21 · Mar 10, 2020 · Updated 6 years ago
cambridgeltl / adversarial-postspec
Auxiliary GAN for WE post-specialisation
☆24 · Feb 22, 2019 · Updated 7 years ago
petezh / OpenD5
Tasks for describing differences between text distributions.
☆17 · Aug 9, 2024 · Updated last year
jeongukjae / KR-BERT-SimCSE
Implementing SimCSE using KR-BERT
☆31 · Jul 23, 2021 · Updated 5 years ago
alexa / ramen
A software for transferring pre-trained English models to foreign languages
☆20 · Mar 20, 2023 · Updated 3 years ago
cindyxinyiwang / expand-via-lexicon-based-adaptation
Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation"
☆29 · Apr 2, 2022 · Updated 4 years ago
adapter-hub / xGQA
☆25 · Mar 4, 2022 · Updated 4 years ago
nika2312 / qa_explaination
☆13 · Jul 8, 2020 · Updated 6 years ago
Genius1237 / TyDiP
TyDiP Multilingual Politeness dataset and code
☆12 · Oct 15, 2023 · Updated 2 years ago
NewsEye / NLP-Notebooks-Newspaper-Collections
A collection of notebooks for Natural Language Processing
☆25 · Jan 13, 2025 · Updated last year
hipe-eval / HIPE-scorer
A python module for evaluating NERC and NEL system performances as defined in the HIPE shared tasks (formerly CLEF-HIPE-2020-scorer).
☆17 · Jun 4, 2024 · Updated 2 years ago
AkariAsai / XORQA
This is the official repository for NAACL 2021, "XOR QA: Cross-lingual Open-Retrieval Question Answering".
☆80 · Jun 3, 2021 · Updated 5 years ago
neulab / InterpretEval
Interpretable Evaluation for (Almost) All NLP Tasks
☆194 · Sep 22, 2025 · Updated 10 months ago
UniversalNER / UniversalNER
☆28 · Apr 19, 2026 · Updated 3 months ago
kensho-technologies / pathpiece
PathPiece tokenizer
☆14 · Nov 10, 2024 · Updated last year
lingjzhu / spoken_sent_embedding
Unsupervised spoken sentence embeddings
☆14 · Dec 14, 2022 · Updated 3 years ago
phosseini / GisPy
GisPy: A Tool for Measuring Gist Inference Score in Text https://aclanthology.org/2022.wnu-1.5/
☆13 · Jul 1, 2024 · Updated 2 years ago
jouniluoma / bert-ner-cmv
☆13 · Dec 17, 2021 · Updated 4 years ago
kb-labb / kb_bart
Pretraining scripts for BART transformer model
☆12 · May 15, 2023 · Updated 3 years ago
pdufter / staticlama
☆13 · Apr 16, 2021 · Updated 5 years ago
mayhewsw / pytorch-truecaser
A simple neural truecaser written in pytorch and allennlp.
☆35 · Jun 17, 2024 · Updated 2 years ago
uds-lsv / TOKEN-is-a-MASK
Code for our TSD paper "TOKEN is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models"
☆14 · Aug 19, 2022 · Updated 3 years ago
monologg / EncT5
Pytorch Implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks
☆62 · Jan 22, 2022 · Updated 4 years ago
jiphyeonjeon / season3
Jiphyeonjeon Season 3
☆40 · Apr 18, 2022 · Updated 4 years ago
gucci-j / light-transformer-emnlp2021
EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling
☆34 · Nov 21, 2021 · Updated 4 years ago
princeton-nlp / ShortcutGrammar
EMNLP 2022: Finding Dataset Shortcuts with Grammar Induction https://arxiv.org/abs/2210.11560
☆59 · Feb 28, 2025 · Updated last year
frankxu2004 / knnlm-why
Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"
☆59 · Jan 12, 2023 · Updated 3 years ago
AI21Labs / pmi-masking
This repository includes the masking vocabulary used in the ICLR 2021 spotlight PMI-Masking paper
☆14 · Aug 9, 2021 · Updated 4 years ago
cisnlp / MEXA
[ACL 2025] 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
☆11 · Apr 6, 2025 · Updated last year
smallbenchnlp / ELECTRA-DeBERTa
☆16 · Dec 14, 2022 · Updated 3 years ago
ByronHsu / FlyteGPT
🦅🔗 Building FlyteGPT on Flyte with LangChain
☆30 · Jan 23, 2024 · Updated 2 years ago
stefan-it / german-gpt2
German GPT-2 model
☆32 · Aug 17, 2021 · Updated 4 years ago
facebookresearch / irt-leaderboard
Leaderboards are widely used in NLP and push the field forward. While leaderboards are a straightforward ranking of NLP models, this simp…
☆18 · Mar 30, 2022 · Updated 4 years ago
microsoft / Multilingual-Evaluation-of-Generative-AI-MEGA
Code for Multilingual Eval of Generative AI paper published at EMNLP 2023
☆72 · Mar 6, 2024 · Updated 2 years ago
adapter-hub / Hub
ARCHIVED. Please use https://docs.adapterhub.ml/huggingface_hub.html || 🔌 A central repository collecting pre-trained adapter modules
☆69 · May 26, 2024 · Updated 2 years ago
lm-pub-quiz / lm-pub-quiz
Evaluate language models using multiple choice items
☆13 · Mar 6, 2026 · Updated 4 months ago
mnamysl / nat-acl2020
☆15 · May 26, 2021 · Updated 5 years ago