codegram / calbert
Catalan ALBERT (A Lite BERT for self-supervised learning of language representations)
☆14 · Updated 5 years ago
Alternatives and similar repositories for calbert
Users who are interested in calbert are comparing it to the libraries listed below.
- The RadioTalk dataset of talk radio transcripts ☆61 · Updated 4 years ago
- Pre-production releases for Spacy in Catalan ☆14 · Updated 4 years ago
- Official source for Catalan Language Models and resources made within Aina project. ☆25 · Updated 2 years ago
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler ☆24 · Updated 4 years ago
- ☆17 · Updated last year
- phone inventory library ☆17 · Updated 2 years ago
- Simple text to phonemes converter for multiple languages ☆20 · Updated 3 years ago
- A crash course for training speech recognition models using DeepSpeech. ☆24 · Updated 4 years ago
- Gamma Agreement in Python ☆45 · Updated last year
- Using YouTube to prepare a speech recognition dataset for any language ☆10 · Updated 4 years ago
- ☆75 · Updated 4 years ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline. ☆86 · Updated 4 years ago
- Artie Bias Corpus: an audio corpus + code for detecting demographic bias ☆20 · Updated 5 years ago
- Open Source AI Benchmarking toolkit for benchmarking speech to text services ☆58 · Updated last year
- Extremely easy to use sequence to sequence library with attention, for text to text conversion tasks. ☆39 · Updated 5 years ago
- Markdown template for Datasheets for Datasets ☆63 · Updated 3 years ago
- Deepspeech ASR Model for the Catalan Language ☆17 · Updated 4 years ago
- Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion. ☆130 · Updated 4 years ago
- Forced Alignments for Common Voice ☆31 · Updated 5 years ago
- 🐸TTS recipes for different datasets ☆86 · Updated 3 years ago
- A simple neural truecaser written in pytorch and allennlp. ☆33 · Updated last year
- ☆15 · Updated 6 years ago
- Experiments with Hugging Face 🔬 🤗 ☆44 · Updated last year
- Visualize large text collections with WebGL ☆26 · Updated last year
- Spoken Language Identification on Common Voice and AudioSet using Deep Learning ☆40 · Updated 3 years ago
- Master's thesis project in collaboration with Rasa, focusing on knowledge distillation from BERT into different very small networks and a… ☆13 · Updated 3 years ago
- 🦁 Nala is an agile open-source voice assistant framework (20+ actions). ☆35 · Updated 2 years ago
- Multi-Language Dataset Cleaner/Creator for Mozilla's DeepSpeech Framework ☆48 · Updated 2 years ago
- MaSS - Multilingual corpus of Sentence-aligned Spoken utterances ☆50 · Updated last year
- Experiments with generating GPT-2 fanfiction on specified topics. ☆11 · Updated 6 years ago