Softcatala / ca-text-corpus
Public domain corpus of Catalan text
☆16 Updated 3 years ago
Alternatives and similar repositories for ca-text-corpus:
Users who are interested in ca-text-corpus are comparing it to the repositories listed below:
- Austronesian Comparative Dictionary ☆12 Updated 3 weeks ago
- Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Plains Cree language ☆15 Updated this week
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates ☆46 Updated last year
- Cython wrapper on Hunspell Dictionary ☆66 Updated 7 months ago
- Deepspeech ASR Model for the Catalan Language ☆17 Updated 3 years ago
- A Python toolkit converting pronunciation in enwiktionary xml dump to cmudict format ☆33 Updated 5 years ago
- Annotations and scripts for use with University of Wisconsin X-Ray Microbeam Speech Production Database (1994) ☆10 Updated 4 years ago
- A repository containing links to useful phonological software ☆11 Updated last year
- Use spaCy for NLP and output to the FoLiA XML format. ☆12 Updated 11 months ago
- linguistic data on the Yongning Na language ☆7 Updated this week
- The curation repository for the data behind Concepticon. ☆37 Updated this week
- ☆31 Updated 3 years ago
- Python Finite-State Toolkit ☆48 Updated last week
- ☆10 Updated 3 years ago
- now you can even use apertium from python ☆31 Updated 11 months ago
- Cross-Linguistic Transcription Systems ☆14 Updated last month
- Official source for Catalan Language Models and resources made within Aina project. ☆23 Updated last year
- Generation of bilingual dictionaries from Wiktionary/dbnary data for the WikDict project ☆46 Updated 3 months ago
- German lemmatization with IWNLP as extension for spaCy ☆24 Updated last year
- AUTOTYP data export ☆41 Updated last year
- universal syllabification algorithms ☆44 Updated 2 years ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆41 Updated 2 years ago
- Wiktionary parser tool for many language editions. ☆54 Updated 2 years ago
- A versioned python wrapper package for cmudict (https://github.com/cmusphinx/cmudict). ☆61 Updated 3 weeks ago
- A lexicon compiler for non-suffixational morphologies ☆11 Updated last month
- Semantic spaces in python ☆14 Updated last year
- ☆10 Updated last year
- German Morphological Analyzer ☆47 Updated 3 years ago
- An LL parser for extracting information from Wiki text, particularly Wiktionary. ☆48 Updated last year
- Python script to convert .srt subtitle files to Praat .textgrid files ☆17 Updated 6 months ago