Softcatala / ca-text-corpus
Public domain corpus of Catalan text
☆18Updated 3 years ago
Alternatives and similar repositories for ca-text-corpus
Users who are interested in ca-text-corpus are comparing it to the repositories listed below.
- A corpus of diacritized Hebrew texts (טקסט מנוקד)☆11Updated 3 years ago
- Study on lexibank data (presenting the lexibank dataset).☆15Updated 8 months ago
- Cross-Linguistic Transcription Systems☆16Updated 11 months ago
- Python Finite-State Toolkit☆60Updated 3 weeks ago
- Official source for Catalan Language Models and resources made within Aina project.☆25Updated 2 years ago
- A Python package for processing research with Minimalist grammars☆21Updated 4 years ago
- Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Plains Cree language☆16Updated this week
- Apertium linguistic data for Catalan☆11Updated 2 weeks ago
- Neural based model for automatic diacritics restoration.☆25Updated 7 years ago
- Jason Riggle's chart of phonological features in JSON format + extras☆54Updated last year
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests☆41Updated 3 years ago
- VoxAngeles Corpus☆13Updated 3 months ago
- Breaks a word into syllables using an LSTM-based neural network.☆20Updated 2 years ago
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates☆51Updated 2 years ago
- ☆22Updated 3 years ago
- Domain-specific programming language for linguistic grammars and transducers — Langage dédié pour les grammaires linguistiques et les tra…☆16Updated this week
- Character-level conversion between Hebrew text and Latin transliteration using deep learning - a demonstration of seq2seq training.☆14Updated 2 years ago
- A versioned python wrapper package for cmudict (https://github.com/cmusphinx/cmudict).☆65Updated 2 weeks ago
- Bunachar Náisiúnta Moirfeolaíochta | Irish National Morphology Database☆26Updated last year
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format☆33Updated 6 years ago
- phone inventory library☆17Updated 2 years ago
- Language Acquisition Research Tools☆43Updated 3 weeks ago
- Acoustic and language models for minorised languages.☆26Updated 5 years ago
- A repository containing links to useful phonological software☆12Updated 2 years ago
- 🐍🍑 Python 3 library for managing, annotating, and converting natural language corpuses using popular formats (CoNLL, ELAN, Praat, CSV, …☆20Updated last year
- Markdown template for Datasheets for Datasets☆63Updated 3 years ago
- Wiktionary parser tool for many language editions.☆54Updated 3 years ago
- Grapheme to phoneme converter for Estonian☆14Updated 4 years ago
- Audiobook alignment for Indigenous languages☆45Updated last week
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages.☆35Updated 2 years ago