Official source for the Catalan language models and resources created within the Aina project.
☆26 · Jul 28, 2023 · Updated 2 years ago
Alternatives and similar repositories for lm-catalan
Users who are interested in lm-catalan are comparing it to the repositories listed below.
- ☆13 · Aug 23, 2024 · Updated last year
- RESTful API for synthesizing speech in Catalan ☆16 · Nov 5, 2024 · Updated last year
- A free & open tool for transcribing audio interviews with offline ASR support ☆25 · Dec 21, 2023 · Updated 2 years ago
- ☆17 · Apr 28, 2021 · Updated 4 years ago
- Deepspeech ASR Model for the Catalan Language ☆17 · Feb 15, 2021 · Updated 5 years ago
- Catalan ALBERT (A Lite BERT for self-supervised learning of language representations) ☆14 · Jul 9, 2020 · Updated 5 years ago
- ☆17 · Mar 1, 2024 · Updated 2 years ago
- Public domain corpus of Catalan text ☆18 · Dec 20, 2021 · Updated 4 years ago
- Phonetically-Oriented Word Error Rate ☆36 · May 4, 2019 · Updated 6 years ago
- A tool to collect/validate audio recordings from workers on Amazon Mechanical Turk. Written in Python/Flask. (originally hosted on github… ☆14 · Dec 19, 2022 · Updated 3 years ago
- Acoustic and language models for minorised languages. ☆26 · Sep 30, 2020 · Updated 5 years ago
- Code for "Error-driven Fixed-Budget ASR Personalization for Accented Speakers" in ICASSP 2021 ☆11 · Jun 13, 2021 · Updated 4 years ago
- VoxAngeles Corpus ☆13 · Aug 23, 2025 · Updated 6 months ago
- Tool for creating Kaldi nnet3 recipes using the International Phonetic Alphabet (IPA) ☆10 · Jun 2, 2021 · Updated 4 years ago
- ☆10 · Mar 20, 2021 · Updated 4 years ago
- ☆11 · Nov 5, 2021 · Updated 4 years ago
- Using YouTube to prepare a speech recognition dataset for any language ☆10 · Mar 30, 2021 · Updated 4 years ago
- A corpus of diacritized Hebrew texts (טקסט מנוקד) ☆11 · May 4, 2022 · Updated 3 years ago
- The project for speech translation ☆12 · Sep 28, 2023 · Updated 2 years ago
- Thai Grapheme to Phoneme (G2P) Wiktionary Corpus ☆13 · Jul 25, 2022 · Updated 3 years ago
- This repo contains the baseline model recipes and pre-trained model for GramVanni hindi ASR challenge ☆15 · Mar 26, 2022 · Updated 3 years ago
- steps to perform text-based speaker diarization with kaldi toolkit ☆12 · Nov 2, 2018 · Updated 7 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models. ☆13 · Feb 13, 2021 · Updated 5 years ago
- ☆13 · Nov 16, 2022 · Updated 3 years ago
- Simple Kaldi recipe for forced alignment ☆11 · Jul 16, 2023 · Updated 2 years ago
- Study on lexibank data (presenting the lexibank dataset). ☆15 · Apr 11, 2025 · Updated 10 months ago
- ☆18 · Feb 16, 2026 · Updated 2 weeks ago
- Annotations and scripts for use with University of Wisconsin X-Ray Microbeam Speech Production Database (1994) ☆13 · Oct 8, 2020 · Updated 5 years ago
- Pre-production releases for Spacy in Catalan ☆14 · Nov 30, 2021 · Updated 4 years ago
- Coqui STT (🐸STT) based forced alignment tool ☆13 · Feb 24, 2022 · Updated 4 years ago
- A python tool that converts Arabic diacritised text to a sequence of phonemes and creates a pronunciation dictionary. This code is based … ☆16 · Sep 5, 2017 · Updated 8 years ago
- This is a mirror of https://gitlab.com/tiro-is/tiro-speech-core ☆15 · Jun 19, 2023 · Updated 2 years ago
- ☆14 · Feb 9, 2023 · Updated 3 years ago
- phone inventory library ☆17 · May 15, 2023 · Updated 2 years ago
- ☆20 · Sep 20, 2024 · Updated last year
- Text-to-Speech conversor for Basque and Spanish. It includes linguistic processing and built voices for the languages aforementioned. Its… ☆18 · Jan 15, 2026 · Updated last month
- finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests ☆14 · Jan 24, 2017 · Updated 9 years ago
- This will hold the crowdsourcing platform to be used to store voice data from various speakers which will act as input dataset for speech… ☆17 · Mar 6, 2023 · Updated 3 years ago
- Spacy NLP Model for the Catalan language ☆16 · Nov 21, 2020 · Updated 5 years ago