Softcatala / julibert
Catalan BERT model
☆12 · Updated 4 years ago
Alternatives and similar repositories for julibert
Users who are interested in julibert compare it to the libraries listed below.
- Gamma Agreement in Python ☆44 · Updated last year
- Resource and Tool for Writing System Identification -- LREC 2024 ☆14 · Updated 11 months ago
- Compound splitter for German language ("Komposita-Zerlegung") based on large dictionary combined with highly efficient multi-pattern stri… ☆25 · Updated 2 years ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation ☆26 · Updated 2 years ago
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks. ☆75 · Updated last year
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆45 · Updated 2 years ago
- ☆42 · Updated 3 years ago
- SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages ☆9 · Updated last year
- DeepSpeech ASR Model for the Catalan Language ☆17 · Updated 4 years ago
- ☆47 · Updated 9 months ago
- Bicleaner fork that uses neural networks ☆40 · Updated this week
- Compound splitter for German ☆105 · Updated 5 years ago
- Easier Automatic Sentence Simplification Evaluation ☆159 · Updated last year
- The Benchmark of Linguistic Minimal Pairs ☆150 · Updated 2 years ago
- Morphological Inflection for Low-Resource Languages using cross-lingual transfer ☆20 · Updated 5 years ago
- The central repo for Creole based NLU and NLG work ☆18 · Updated 2 weeks ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆31 · Updated 2 months ago
- Python Finite-State Toolkit ☆54 · Updated last week
- A guide to building language technology in new languages. ☆58 · Updated 3 years ago
- ☆23 · Updated 3 years ago
- An initiative to collect and distribute resources for co-reference resolution in a unified standard. ☆25 · Updated last year
- Morfessor EM+Prune ☆10 · Updated 4 years ago
- Transform TMX to text ☆28 · Updated 2 years ago
- ☆23 · Updated 5 years ago
- A tool that locates, downloads, and extracts machine translation corpora ☆154 · Updated 3 weeks ago
- This is a German text corpus from Wikipedia. It is cleaned, preprocessed and sentence-split. Its purpose is to train NLP embeddings l… ☆24 · Updated 3 years ago
- A tokenizer and sentence splitter for German and English web and social media texts. ☆142 · Updated 5 months ago
- ☆19 · Updated 3 years ago
- Tool to fix bitexts and tag near-duplicates for removal ☆30 · Updated 3 months ago
- OpusFilter - Parallel corpus processing toolkit ☆104 · Updated last month