Softcatala / ca-text-corpus
Public domain corpus of Catalan text
☆16 Updated 3 years ago
Alternatives and similar repositories for ca-text-corpus
Users who are interested in ca-text-corpus are comparing it to the repositories listed below
- Apertium linguistic data for Catalan ☆11 Updated this week
- Catalan bert model ☆12 Updated 4 years ago
- Jason Riggle's chart of phonological features in JSON format + extras ☆53 Updated 10 months ago
- universal syllabification algorithms ☆44 Updated 2 years ago
- Study on lexibank data (presenting the lexibank dataset). ☆13 Updated last month
- Catalan ALBERT (A Lite BERT for self-supervised learning of language representations) ☆14 Updated 4 years ago
- Text-to-Speech conversor for Basque and Spanish. It includes linguistic processing and built voices for the languages aforementioned. Its… ☆16 Updated last year
- Python for Linguists – a Gentle Introduction to Programming ☆45 Updated 9 years ago
- Python Finite-State Toolkit ☆54 Updated last week
- Use spaCy for NLP and output to the FoLiA XML format. ☆12 Updated last year
- CogNet: a large-scale, high-quality cognate database for 338 languages, 1.07M words, and 8.1 million cognates ☆48 Updated last year
- Cross-Linguistic Transcription Systems ☆14 Updated 5 months ago
- Deepspeech ASR Model for the Catalan Language ☆17 Updated 4 years ago
- Official source for Catalan Language Models and resources made within Aina project. ☆24 Updated last year
- Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Plains Cree language ☆16 Updated this week
- documentation for things like relations and parts of speech used by wordnets ☆13 Updated 11 months ago
- Recipes for cooking with CLDF data ☆17 Updated 5 months ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆63 Updated last year
- The curation repository for the data behind Concepticon. ☆38 Updated 2 weeks ago
- A list of resources for conservation, development, and documentation of endangered, minority, and low or under-resourced human languages. ☆34 Updated 2 years ago
- A Python package for processing research with Minimalist grammars ☆21 Updated 3 years ago
- ☆28 Updated this week
- Cog is a tool for comparing languages using lexicostatistics and comparative linguistics techniques. ☆23 Updated last year
- Bilingual sentence aligner (Gale & Church, 1993) ☆14 Updated 6 years ago
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆41 Updated 2 years ago
- Scripts for compatibilitising between VISL-CG3, Apertium, CoNLL-X and Universal Dependencies ☆15 Updated 5 years ago
- Tools and scripts for working with ELAN ☆10 Updated 2 years ago
- Data for the International Phonetic Alphabet (IPA) ☆28 Updated 2 years ago
- Scansion tool for Spanish texts ☆12 Updated last year
- Pre-production releases for Spacy in Catalan ☆14 Updated 3 years ago