neocl / speach
Python 3 library for managing, annotating, and converting natural language corpuses using popular formats (CoNLL, ELAN, Praat, CSV, JSON, SQLite, VTT, Audacity, TTL, TIG, ISF, etc.)
★20 · Updated last year
Alternatives and similar repositories for speach
Users who are interested in speach are comparing it to the libraries listed below.
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation ★31 · Updated 2 years ago
- Python Finite-State Toolkit ★60 · Updated last month
- A Language-Independent Unsupervised Morphological Segmentation Framework based on Adaptor Grammars ★17 · Updated last year
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ★54 · Updated 2 years ago
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies) ★34 · Updated 7 months ago
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks. ★76 · Updated 2 years ago
- An NLP pipeline for Hebrew ★40 · Updated 7 months ago
- ★50 · Updated last year
- ★45 · Updated 3 years ago
- MAGPIE: A sense-annotated corpus of potentially idiomatic expressions ★30 · Updated 5 years ago
- Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Plains Cree language ★16 · Updated 2 weeks ago
- Morfessor is a tool for unsupervised and semi-supervised morphological segmentation ★200 · Updated 5 years ago
- Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation ★15 · Updated last year
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format ★33 · Updated 6 years ago
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023) ★74 · Updated 9 months ago
- SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP ★14 · Updated 4 years ago
- Small-vocabulary neural sequence-to-sequence generation with optional feature conditioning ★35 · Updated 3 weeks ago
- Gamma Agreement in Python ★45 · Updated last year
- ★22 · Updated 3 years ago
- Unicode Standard tokenization routines and orthography profile segmentation ★38 · Updated 11 months ago
- List of corpora annotated for coreference for different languages ★17 · Updated last year
- NTREX -- News Test References for MT Evaluation ★87 · Updated last year
- OpusFilter - Parallel corpus processing toolkit ★115 · Updated last week
- phone inventory library ★17 · Updated 2 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ★160 · Updated last year
- MultiLexNorm 2021 competition system from ÚFAL ★15 · Updated 4 years ago
- ★19 · Updated 4 years ago
- Featurize words into orthographic and phonological vectors. ★41 · Updated 2 years ago
- Creating super-parallel corpora of 1500+ unique languages for NLP research ★34 · Updated 3 years ago
- Several algorithms for converting dependency structures into constituency structures. ★10 · Updated 3 years ago