neocl / speach
Python 3 library for managing, annotating, and converting natural language corpuses using popular formats (CoNLL, ELAN, Praat, CSV, JSON, SQLite, VTT, Audacity, TTL, TIG, ISF, etc.)
☆20 · Updated last year
Alternatives and similar repositories for speach
Users who are interested in speach compare it to the libraries listed below.
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation ☆31 · Updated 2 years ago
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format ☆33 · Updated 6 years ago
- Python Finite-State Toolkit ☆60 · Updated 2 weeks ago
- A Language-Independent Unsupervised Morphological Segmentation Framework based on Adaptor Grammars ☆17 · Updated last year
- Gamma Agreement in Python ☆45 · Updated last year
- Unicode Standard tokenization routines and orthography profile segmentation ☆38 · Updated 10 months ago
- ☆22 · Updated 3 years ago
- A guide to building language technology in new languages. ☆59 · Updated 3 years ago
- phone inventory library ☆17 · Updated 2 years ago
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks. ☆76 · Updated 2 years ago
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆52 · Updated 2 years ago
- A tiny BERT for low-resource monolingual models ☆31 · Updated 2 weeks ago
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies) ☆32 · Updated 6 months ago
- ☆45 · Updated 3 years ago
- ☆19 · Updated 4 years ago
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023) ☆74 · Updated 9 months ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆35 · Updated 9 months ago
- MultiLexNorm 2021 competition system from ÚFAL ☆15 · Updated 4 years ago
- NTREX -- News Test References for MT Evaluation ☆86 · Updated last year
- A corpus of diacritized Hebrew texts ☆11 · Updated 3 years ago
- List of corpora annotated for coreference for different languages ☆17 · Updated last year
- ☆50 · Updated last year
- Corpus preprocessing ☆99 · Updated last year
- Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation ☆15 · Updated last year
- Finite-state script normalization and processing utilities ☆46 · Updated 3 weeks ago
- Creating super-parallel corpora of more than 1,500 unique languages for NLP research ☆34 · Updated 3 years ago
- MAGPIE: A sense-annotated corpus of potentially idiomatic expressions ☆30 · Updated 5 years ago
- An NLP pipeline for Hebrew ☆40 · Updated 6 months ago
- A minimal, pure Python library to interface with CoNLL-U format files. ☆153 · Updated last month
- Featurize words into orthographic and phonological vectors. ☆41 · Updated 2 years ago