morrisalp / unikud
Hebrew nikud with transformers
☆17Updated last month
Alternatives and similar repositories for unikud:
Users who are interested in unikud are comparing it to the libraries listed below:
- Hebrew Diacritizer☆36Updated 3 weeks ago
- Character-level conversion between Hebrew text and Latin transliteration using deep learning - a demonstration of seq2seq training.☆12Updated last year
- An NLP pipeline for Hebrew☆37Updated 3 weeks ago
- AlephBertGimmel - Modern Hebrew pretrained BERT model with a 128K token vocabulary.☆23Updated 2 years ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models.☆51Updated 2 months ago
- ☆51Updated 3 years ago
- A Language-Independent Unsupervised Morphological Segmentation Framework based on Adaptor Grammars☆17Updated 9 months ago
- downloads and parses subtitle dataset from opensubtitles.org☆16Updated 11 months ago
- A tool for transliterating Hebrew☆41Updated last month
- Modified version of RusStress (https://github.com/MashaPo/russtress) — python package for placing stress in Russian text using RNN (BiLST…☆33Updated 7 months ago
- Overview of Icelandic NLP resources at a glance☆16Updated 9 months ago
- Unicode Standard tokenization routines and orthography profile segmentation☆35Updated last month
- A curated list of resources for NLP (Natural Language Processing) for Hebrew☆109Updated 2 years ago
- Python module for syllabifying English ARPABET transcriptions☆66Updated 6 years ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation.☆105Updated last month
- ☆34Updated last year
- Suite of web packages for creating interactive ReadAlongs☆14Updated this week
- Domain-specific programming language for linguistic grammars and transducers — Langage dédié pour les grammaires linguistiques et les tra…☆13Updated last week
- Grapheme-to-Phoneme transductions that preserve input and output indices, and support cross-lingual g2p!☆157Updated this week
- ☆72Updated this week
- Multi-Language Identification☆29Updated 8 months ago
- state-of-the-art models for diacritics restoration for Arabic language☆11Updated last month
- The EveryVoice TTS Toolkit - Text To Speech for your language☆25Updated last week
- Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Plains Cree language☆16Updated this week
- Audiobook alignment for Indigenous languages☆39Updated last month
- sound stretch python module☆11Updated 5 years ago
- HeBERT: Pre-training BERT for modern Hebrew☆76Updated last year
- Model for recasing and repunctuating ASR transcripts☆133Updated 11 months ago
- universal syllabification algorithms☆44Updated 2 years ago
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023)☆70Updated 11 months ago