konstantinjdobler / focus
[EMNLP'23] Official Code for "FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models"
☆36Jun 7, 2025Updated 8 months ago
Alternatives and similar repositories for focus
Users who are interested in focus are comparing it to the libraries listed below
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.☆87Sep 12, 2024Updated last year
- A framework that aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining☆18Nov 26, 2023Updated 2 years ago
- Code for Zero-Shot Tokenizer Transfer☆142Jan 14, 2025Updated last year
- [Konvens21] This repository contains the DFKI MobIE Corpus, a dataset of 3,232 German-language documents that have been annotated with fi…☆12Sep 17, 2024Updated last year
- Experiments for XLM-V Transformers Integration☆13Feb 8, 2023Updated 3 years ago
- Goldfish: Monolingual language models for 350 languages.☆23Aug 25, 2024Updated last year
- [ACL 2024] LangBridge: Multilingual Reasoning Without Multilingual Supervision☆95Oct 30, 2024Updated last year
- SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects☆23Jan 26, 2025Updated last year
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023☆106Apr 20, 2024Updated last year
- Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS2024 Oral)☆34Jan 18, 2025Updated last year
- Seed Machine Translation Data☆33Nov 12, 2024Updated last year
- Arabic News Stance Corpus☆11Feb 5, 2021Updated 5 years ago
- Research code for pixel-based encoders of language (PIXEL)☆346Jul 15, 2025Updated 6 months ago
- [EACL 2023] CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification☆41Apr 29, 2023Updated 2 years ago
- ☆12May 26, 2022Updated 3 years ago
- ☆10May 19, 2024Updated last year
- Sequential Parameter Optimization in Python☆14Jan 12, 2026Updated last month
- COMET for African languages☆10Jan 24, 2025Updated last year
- CAR-bench☆18Feb 6, 2026Updated last week
- CycleQD is a framework for parameter space model merging.☆48Feb 1, 2025Updated last year
- MultilingualSIFT: Multilingual Supervised Instruction Fine-tuning☆95Aug 15, 2023Updated 2 years ago
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks☆12Nov 9, 2021Updated 4 years ago
- Official PyTorch implementation of CD-MOE☆12Mar 29, 2025Updated 10 months ago
- OCCA Python API: JIT Compilation for Multiple Architectures☆11Dec 20, 2019Updated 6 years ago
- Implementation of a network for Handwriting Synthesis based on the work of Generating Sequences With Recurrent Neural Networks by Alex Gr…☆11May 12, 2025Updated 9 months ago
- [COLM 2025: 1st Workshop on the Application of LLM Explainability to Reasoning and Planning] Latent Chain-of-Thought? Decoding the Depth-…☆17Oct 4, 2025Updated 4 months ago
- AutoRAG example about benchmarking Korean embeddings.☆43Oct 2, 2024Updated last year
- BERT score for text generation☆12Jan 15, 2025Updated last year
- Label shift estimation for transfer difficulty with Familiarity.☆10Feb 4, 2025Updated last year
- MaXM is a suite of test-only benchmarks for multilingual visual question answering in 7 languages: English (en), French (fr), Hindi (hi),…☆13Jan 16, 2024Updated 2 years ago
- Volume Rendering Sample project using C++, Qt5, OpenGL, OpenCL☆10Jan 6, 2022Updated 4 years ago
- ☆12Dec 4, 2020Updated 5 years ago
- ☆13Oct 3, 2024Updated last year
- Experimental tl;dr summaries for datasets on the Hugging Face Hub!☆10Apr 4, 2024Updated last year
- OCaml PPX extension for automatically generating Irmin types☆11Jan 14, 2020Updated 6 years ago
- suffix array construction and searching algorithms for in-memory binary data.☆12Sep 10, 2022Updated 3 years ago
- Less-wrong single-file Numba-accelerated Python implementation of Gotoh affine gap penalty extensions for the Needleman–Wunsch, Smith-Wat…☆12Oct 30, 2025Updated 3 months ago
- A proofreading tool using Google's N-gram corpus.☆12Sep 2, 2022Updated 3 years ago
- A python library for easily querying morphological inflection models trained on Unimorph☆13Oct 23, 2022Updated 3 years ago