roedoejet / convertextract
Extract and find/replace text based on arbitrary correspondences while preserving original file formatting. This library is a fork of the Textract library by Dean Malmgren.
☆11 · Updated 2 years ago
Alternatives and similar repositories for convertextract
Users who are interested in convertextract are comparing it to the libraries listed below.
- A tool for automatic phoneme transcription ☆159 · Updated 2 years ago
- Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Plains Cree language ☆16 · Updated last week
- 🙊 software for creating speech recognition models. ☆160 · Updated last year
- A code for transliterating (romanizing) Arabic text using the American Library Association - Library of Congress (ALA-LC) standard ☆49 · Updated 3 years ago
- A module for normalising text. ☆172 · Updated 4 years ago
- Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support. ☆292 · Updated 10 months ago
- Featurize words into orthographic and phonological vectors. ☆41 · Updated 2 years ago
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates ☆53 · Updated 2 years ago
- A versioned python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). ☆67 · Updated last week
- Calculates the word error rate of two strings, and the result is written into beautify HTML. ☆19 · Updated 5 years ago
- ☆48 · Updated 8 years ago
- Offline transcription system for Estonian using Kaldi ☆228 · Updated 3 years ago
- Unsupervised Speaker Clustering & Speaker Recognition ☆13 · Updated 7 years ago
- Automatically exported from code.google.com/p/m2m-aligner ☆42 · Updated 9 years ago
- Crawler for linguistic corpora ☆213 · Updated 5 months ago
- Tools and scripts for working with ELAN ☆10 · Updated 3 years ago
- Script for workflow to add morphological analysis into ELAN files ☆14 · Updated 5 years ago
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆35 · Updated 2 years ago
- universal syllabification algorithms ☆46 · Updated 3 years ago
- Arabic Phonetic Dictionary Generator Tool for Automatic Speech Recognition Applications ☆12 · Updated 4 years ago
- Linguistic processing for Common Voice ☆58 · Updated 2 years ago
- SpeCT - Speech Corpus Toolkit for Praat. Documentation: https://lennes.github.io/spect/ ☆57 · Updated 5 months ago
- Python module for syllabifying English ARPABET transcriptions ☆72 · Updated 6 years ago
- PHOIBLE data and development. ☆141 · Updated last year
- Massively multilingual pronunciation mining ☆361 · Updated 3 weeks ago
- Unicode Standard tokenization routines and orthography profile segmentation ☆39 · Updated 11 months ago
- Open Source AI Benchmarking toolkit for benchmarking speech to text services ☆58 · Updated last year
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format ☆33 · Updated 6 years ago
- Spoken Language Identification on Common Voice and AudioSet using Deep Learning ☆42 · Updated last week
- Corpus of oral arguments (recorded speech + official transcripts) of the United States Supreme Court ☆22 · Updated 3 years ago