neocl / speach
Python 3 library for managing, annotating, and converting natural language corpuses using popular formats (CoNLL, ELAN, Praat, CSV, JSON, SQLite, VTT, Audacity, TTL, TIG, ISF, etc.)
☆17 · Updated 11 months ago
Alternatives and similar repositories for speach
Users who are interested in speach are comparing it to the libraries listed below.
- Unicode Standard tokenization routines and orthography profile segmentation · ☆37 · Updated 3 months ago
- (no description) · ☆22 · Updated 3 years ago
- (no description) · ☆19 · Updated 3 years ago
- A guide to building language technology in new languages. · ☆58 · Updated 3 years ago
- Proposed splits for the LREC Wikipron paper · ☆14 · Updated 5 years ago
- A tiny BERT for low-resource monolingual models · ☆31 · Updated 8 months ago
- A Language-Independent Unsupervised Morphological Segmentation Framework based on Adaptor Grammars · ☆17 · Updated 11 months ago
- This repo contains a set of neural transducers (e.g. sequence-to-sequence models) focusing on character-level tasks. · ☆75 · Updated last year
- A Python library for easily querying morphological inflection models trained on UniMorph · ☆13 · Updated 2 years ago
- SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP · ☆14 · Updated 4 years ago
- STREUSLE: a corpus with comprehensive lexical semantic annotation (multiword expressions, supersenses) · ☆66 · Updated last week
- Multilingual Open Text · ☆25 · Updated last month
- Python Finite-State Toolkit · ☆55 · Updated last week
- These are lists for a variety of languages containing words that are distinctive to each language. · ☆38 · Updated 3 years ago
- MAGPIE: A sense-annotated corpus of potentially idiomatic expressions · ☆27 · Updated 5 years ago
- List of corpora annotated for coreference for different languages · ☆17 · Updated 10 months ago
- MultiLexNorm 2021 competition system from ÚFAL · ☆15 · Updated 3 years ago
- Corpus preprocessing · ☆97 · Updated last year
- This repository includes the code for neural DRS parsing · ☆27 · Updated last year
- A simple neural truecaser written in PyTorch and AllenNLP. · ☆33 · Updated 11 months ago
- Several algorithms for converting dependency structures into constituency structures. · ☆10 · Updated 3 years ago
- MorphyNet: a Large Multilingual Database of Derivational and Inflectional Morphology (+morpheme segmentation) · ☆47 · Updated 2 years ago
- Small-vocabulary neural sequence-to-sequence generation with optional feature conditioning · ☆32 · Updated last week
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT · ☆21 · Updated 2 years ago
- Cog is a tool for comparing languages using lexicostatistics and comparative linguistics techniques. · ☆23 · Updated last year
- Scripts and tools for doing unsupervised acceptability prediction. · ☆15 · Updated 2 years ago
- Wiktra: a Python tool built on the Wiktionary transliteration modules for 514 languages and their 102 different scripts (orthographies) · ☆30 · Updated 3 years ago
- SHAS: Approaching optimal Segmentation for End-to-End Speech Translation · ☆38 · Updated 2 years ago
- English web corpus with 4M tokens and several annotation types · ☆26 · Updated last year
- Phone inventory library · ☆16 · Updated 2 years ago