neocl / speach
Python 3 library for managing, annotating, and converting natural language corpuses using popular formats (CoNLL, ELAN, Praat, CSV, JSON, SQLite, VTT, Audacity, TTL, TIG, ISF, etc.)
☆17 · Updated 10 months ago
Alternatives and similar repositories for speach
Users that are interested in speach are comparing it to the libraries listed below
- Python Finite-State Toolkit ☆54 · Updated last week
- List of corpora annotated for coreference for different languages ☆17 · Updated 9 months ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆41 · Updated 2 years ago
- MAGPIE: A sense-annotated corpus of potentially idiomatic expressions ☆27 · Updated 4 years ago
- A Language-Independent Unsupervised Morphological Segmentation Framework based on Adaptor Grammars ☆17 · Updated 11 months ago
- A python library for easily querying morphological inflection models trained on Unimorph ☆13 · Updated 2 years ago
- A tiny BERT for low-resource monolingual models ☆31 · Updated 7 months ago
- Proposed splits for the LREC Wikipron paper ☆14 · Updated 5 years ago
- SIGMORPHON 2022 Shared Task on Morpheme Segmentation ☆26 · Updated 2 years ago
- Python framework for processing Universal Dependencies data ☆57 · Updated last week
- A guide to building language technology in new languages. ☆58 · Updated 3 years ago
- English web corpus with 4M tokens and several annotation types ☆26 · Updated last year
- ☆22 · Updated 3 years ago
- MultiLexNorm 2021 competition system from ÚFAL ☆15 · Updated 3 years ago
- This repo contains a set of neural transducers, e.g. sequence-to-sequence models, focusing on character-level tasks. ☆75 · Updated last year
- Wiktra - Python tool of Wiktionary Transliteration modules for 514 languages and its 102 different scripts (orthographies) ☆30 · Updated 3 years ago
- MAMMOTH: MAssively Multilingual Modular Open Translation @ Helsinki ☆23 · Updated 3 months ago
- several algorithms for converting dependency structures into constituency structures. ☆10 · Updated 3 years ago
- Open-source tools for morphological tagging, segmentation and stemming. ☆40 · Updated 5 years ago
- Multilingual Open Text ☆25 · Updated last week
- A simple neural truecaser written in pytorch and allennlp. ☆33 · Updated 11 months ago
- Unicode Standard tokenization routines and orthography profile segmentation ☆37 · Updated 2 months ago
- Bilingual sentence similarity classifier using Tensorflow ☆21 · Updated 5 years ago
- A collection of English tweets annotated in Universal Dependencies. ☆39 · Updated 3 years ago
- These are lists for a variety of languages containing words that are distinctive to each language. ☆38 · Updated 3 years ago
- A python true casing utility that restores case information for texts ☆88 · Updated 2 years ago
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT ☆21 · Updated last year
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆31 · Updated 2 months ago
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) ☆45 · Updated 2 years ago
- Small-vocabulary neural sequence-to-sequence generation with optional feature conditioning ☆33 · Updated 2 weeks ago