Multilingual Open Text
☆25 · May 8, 2025 · Updated 9 months ago
Alternatives and similar repositories for mot
Users interested in mot are comparing it to the libraries listed below.
- SeqScore: Scoring for named entity recognition and other sequence labeling tasks ☆23 · Dec 16, 2025 · Updated 2 months ago
- Literary Language Toolkit: code, models, corpora, and web tools ☆11 · Mar 28, 2024 · Updated last year
- German Parliamentary Corpus (GerParCor) ☆29 · Jan 14, 2026 · Updated last month
- Code for our TSD paper "TOKEN is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models" ☆14 · Aug 19, 2022 · Updated 3 years ago
- ☆12 · Mar 20, 2020 · Updated 5 years ago
- MINERS ⛏️: The semantic retrieval benchmark for evaluating multilingual language models. (EMNLP 2024 Findings) ☆14 · Oct 3, 2024 · Updated last year
- Explicit Alignment Objectives for Multilingual Bidirectional Encoders ☆14 · Apr 14, 2021 · Updated 4 years ago
- ☆14 · Feb 3, 2021 · Updated 5 years ago
- Named Entity Recognition ☆19 · Feb 13, 2026 · Updated 2 weeks ago
- Layout Analysis Dataset with Segmonto (LADaS) ☆24 · Jul 12, 2025 · Updated 7 months ago
- Make the Best of Cross-lingual Transfer: Evidence from POS Tagging with over 100 Languages (ACL 2022) ☆19 · May 17, 2022 · Updated 3 years ago
- Implementation of the paper 'Sentence Bottleneck Autoencoders from Transformer Language Models' ☆17 · Mar 14, 2022 · Updated 3 years ago
- Virtual Data Augmentation: A Robust and General Framework for Fine-tuning Pre-trained Models ☆16 · Sep 13, 2021 · Updated 4 years ago
- ☆17 · Jan 12, 2023 · Updated 3 years ago
- CodemixedNLP: An Extensible and Open NLP Toolkit for Code-Switching ☆18 · Mar 29, 2021 · Updated 4 years ago
- Code for EMNLP2021 paper "Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training" ☆20 · Nov 12, 2021 · Updated 4 years ago
- ☆25 · Jan 22, 2024 · Updated 2 years ago
- Source stories from the African Storybook Project in Markdown format ☆22 · Jan 25, 2026 · Updated last month
- ☆21 · Oct 19, 2020 · Updated 5 years ago
- ☆24 · Jun 12, 2023 · Updated 2 years ago
- Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning ☆30 · Jan 25, 2023 · Updated 3 years ago
- Small python package to measure OCR quality and other related metrics. ☆27 · Feb 19, 2024 · Updated 2 years ago
- Tensorflow implementation of RankGan (Adversarial Ranking for Language Generation) ☆22 · Jun 15, 2018 · Updated 7 years ago
- Repository collecting resources and best practices to improve experimental rigour in deep learning research. ☆27 · Mar 30, 2023 · Updated 2 years ago
- Code for "BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition" ☆32 · Jun 20, 2023 · Updated 2 years ago
- Neural Language Models for Historical Research ☆29 · Oct 16, 2024 · Updated last year
- This is a repository for NaijaSenti. A Lacuna Funded Project for the development of sentiment corpus for four Nigerian languages: Igbo, H… ☆36 · Oct 14, 2025 · Updated 4 months ago
- A tiny BERT for low-resource monolingual models ☆31 · Dec 24, 2025 · Updated 2 months ago
- ☆28 · Feb 24, 2025 · Updated last year
- Source code for ACL 2022 paper "Self-contrastive Decorrelation for Sentence Embeddings". ☆26 · Mar 10, 2025 · Updated 11 months ago
- Crosslingual Question Answering for African Languages ☆30 · Sep 27, 2024 · Updated last year
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ☆70 · Jan 27, 2023 · Updated 3 years ago
- ☆37 · Sep 22, 2021 · Updated 4 years ago
- ICU based universal language tokenizer ☆34 · Jan 19, 2022 · Updated 4 years ago
- Meta Representation Transformation for Low-resource Cross-lingual Learning ☆41 · May 5, 2021 · Updated 4 years ago
- The implementation of "Neural Machine Translation without Embeddings", NAACL 2021 ☆33 · Jun 9, 2021 · Updated 4 years ago
- EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling ☆34 · Nov 21, 2021 · Updated 4 years ago
- ParaNames: A multilingual resource for parallel names ☆39 · May 20, 2024 · Updated last year
- Code for "Dynamic Contextualized Word Embeddings" ☆32 · Dec 30, 2021 · Updated 4 years ago