Softcatala / julibert
Catalan BERT model
☆12 Updated 4 years ago
Alternatives and similar repositories for julibert:
Users who are interested in julibert are comparing it to the libraries listed below.
- Compound splitter for the German language ("Komposita-Zerlegung") based on a large dictionary combined with highly efficient multi-pattern stri… ☆22 Updated 2 years ago
- SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages ☆8 Updated last year
- Low-Resource Neural Machine Translation: A Benchmark for Five African Languages ☆15 Updated 4 years ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation ☆24 Updated last year
- Compiled tools, datasets, and other resources for historical text normalization. ☆17 Updated 5 years ago
- Finite-state toolkit with EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆14 Updated 8 years ago
- AfroLID, a powerful neural toolkit for African language identification which covers 517 African languages. ☆31 Updated last year
- Gamma Agreement in Python ☆43 Updated 11 months ago
- DeepSpeech ASR model for the Catalan language ☆17 Updated 4 years ago
- Complementary code for our paper "Automatic punctuation restoration with BERT models" ☆48 Updated last year
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023) ☆70 Updated 9 months ago
- This code provides a word-level language identification tool for identifying the language of individual words in code-mixed text, e.g. The tex… ☆51 Updated 4 years ago
- Morfessor EM+Prune ☆10 Updated 4 years ago
- A Language-Independent Unsupervised Morphological Segmentation Framework based on Adaptor Grammars ☆15 Updated 8 months ago
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks. ☆72 Updated last year
- Phone inventory library ☆16 Updated last year
- A survey of corpora for Germanic low-resource languages and dialects ☆24 Updated 2 months ago
- Repository for sharing the data in the Tamasheq language, one of the target languages for the low-resource speech translation track at IW… ☆17 Updated 2 years ago
- Morphological inflection for low-resource languages using cross-lingual transfer ☆20 Updated 5 years ago
- The central repo for Creole-based NLU and NLG work ☆17 Updated 8 months ago
- A character-level BERT for Ancient Greek ☆10 Updated last year
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆40 Updated last year
- Python wrapper for the Phonetisaurus grapheme-to-phoneme tool ☆12 Updated 3 years ago
- Compound splitter for German ☆104 Updated 4 years ago
- A Python truecasing utility that restores case information for texts ☆88 Updated 2 years ago
- A guide to building language technology in new languages. ☆58 Updated 3 years ago