[EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models"
☆36Jun 7, 2025Updated 8 months ago
Alternatives and similar repositories for focus
Users who are interested in focus are comparing it to the libraries listed below
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.☆88Sep 12, 2024Updated last year
- A library for language transfer methods and algorithms.☆16Feb 6, 2026Updated last month
- ☆15Jun 14, 2024Updated last year
- A framework that aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining☆18Nov 26, 2023Updated 2 years ago
- Code for Zero-Shot Tokenizer Transfer☆143Jan 14, 2025Updated last year
- Combining encoder-based language models☆11Nov 11, 2021Updated 4 years ago
- Experiments for XLM-V Transformers Integration☆13Feb 8, 2023Updated 3 years ago
- Goldfish: Monolingual language models for 350 languages.☆23Updated this week
- Vocabulary Trimming (VT) is a model compression technique, which reduces a multilingual LM vocabulary to a target language by deleting ir…☆63Oct 25, 2024Updated last year
- SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects☆23Jan 26, 2025Updated last year
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023☆106Apr 20, 2024Updated last year
- Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS2024 Oral)☆34Jan 18, 2025Updated last year
- Repository for the "Understanding and Mitigating Language Confusion in LLMs" paper☆29Jun 28, 2024Updated last year
- Seed Machine Translation Data☆33Nov 12, 2024Updated last year
- ACL22 paper: Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost☆42Nov 15, 2023Updated 2 years ago
- Research code for pixel-based encoders of language (PIXEL)☆346Jul 15, 2025Updated 7 months ago
- [EACL 2023] CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification☆42Apr 29, 2023Updated 2 years ago
- Named Entity (NER) annotations of the Hebrew Treebank (Haaretz newspaper) corpus, including: morpheme and token level NER labels, nested …☆10Dec 27, 2021Updated 4 years ago
- ☆10May 19, 2024Updated last year
- A CardDAV to IP phones converter for Node.js (AVM FRITZ!Box, Snom XCAP, Yealink)☆14Sep 30, 2025Updated 5 months ago
- COMET for African languages☆10Jan 24, 2025Updated last year
- An opinionated NLP research template☆10Aug 29, 2024Updated last year
- Old book pages (with groundtruth), formerly used for OCR studies. There are several versions of the set (concerning resolution and binari…☆15Aug 25, 2017Updated 8 years ago
- This is code for the EMNLP 2022 Paper "UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation".☆10Apr 30, 2023Updated 2 years ago
- maps are everything.☆10Jul 3, 2025Updated 8 months ago
- Code for the paper "Fishing for Magikarp"☆181Feb 25, 2026Updated last week
- CycleQD is a framework for parameter space model merging.☆48Feb 1, 2025Updated last year
- Ukrainian NER annotation project☆92Apr 23, 2025Updated 10 months ago
- MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning☆95Aug 15, 2023Updated 2 years ago
- ☆10Oct 6, 2021Updated 4 years ago
- A repo to keep all resources about interpretability in NLP organised and up to date☆12Nov 22, 2020Updated 5 years ago
- Emoji-cheat-sheet converter for Python☆10Dec 29, 2014Updated 11 years ago
- [COLM 2025: 1st Workshop on the Application of LLM Explainability to Reasoning and Planning] Latent Chain-of-Thought? Decoding the Depth-…☆17Oct 4, 2025Updated 5 months ago
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks☆12Nov 9, 2021Updated 4 years ago
- [ICLR 2025 SynthData Workshop Spotlight] Empowering LLMs in Decision Games through Algorithmic Data Synthesis☆26Apr 27, 2025Updated 10 months ago
- CAR-bench☆21Feb 23, 2026Updated last week
- Is BERT Robust to Label Noise? A Study on Learning with Noisy Labels in Text Classification☆10May 31, 2022Updated 3 years ago
- Reference implementation of models from Nyonic Model Factory☆12May 13, 2024Updated last year
- ☆11Mar 15, 2024Updated last year