Softcatala / ca-text-corpus
Public domain corpus of Catalan text
☆18Updated 4 years ago
Alternatives and similar repositories for ca-text-corpus
Users who are interested in ca-text-corpus are comparing it to the libraries listed below
- Study on lexibank data (presenting the lexibank dataset).☆15Updated 9 months ago
- Cross-Linguistic Transcription Systems☆16Updated last year
- A corpus of diacritized Hebrew texts (טקסט מנוקד)☆11Updated 3 years ago
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format☆33Updated 6 years ago
- Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Plains Cree language☆16Updated last week
- Jason Riggle's chart of phonological features in JSON format + extras☆54Updated last year
- Grapheme to phoneme converter for Estonian☆14Updated 4 years ago
- Breaks a word into syllables using an LSTM-based neural network.☆20Updated 2 years ago
- A Python package for processing research with Minimalist grammars☆21Updated 4 years ago
- SpeCT - Speech Corpus Toolkit for Praat. Documentation: https://lennes.github.io/spect/☆57Updated 4 months ago
- A repository containing links to useful phonological software☆12Updated 2 years ago
- Gamma Agreement in Python☆45Updated last year
- VoxAngeles Corpus☆13Updated 4 months ago
- Faster, modernized fork of the language identification tool langid.py☆61Updated last year
- Feature set algebra for linguistics☆17Updated last week
- Data from a corpus of written Hawaiian☆17Updated 9 years ago
- Unicode Standard tokenization routines and orthography profile segmentation☆38Updated 10 months ago
- A lemmatizer for Icelandic text☆17Updated 7 years ago
- Small-vocabulary neural sequence-to-sequence generation with optional feature conditioning☆35Updated this week
- ☆10Updated 4 years ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests☆41Updated 3 years ago
- universal syllabification algorithms☆45Updated 3 years ago
- Python Finite-State Toolkit☆60Updated 2 weeks ago
- A versioned python wrapper package for cmudict (https://github.com/cmusphinx/cmudict).☆66Updated last week
- phone inventory library☆17Updated 2 years ago
- OCTRA is a web-application for the orthographic transcription of audio files.☆39Updated last month
- Use spaCy for NLP and output to the FoLiA XML format.☆12Updated last year
- Featurize words into orthographic and phonological vectors.☆41Updated 2 years ago
- A family of efficient speech models for multilingual phone recognition☆33Updated 2 months ago
- The Grammar Matrix☆15Updated last month