khanld / chunkformer
ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription
☆28 · Updated 2 weeks ago
Alternatives and similar repositories for chunkformer
Users who are interested in chunkformer are comparing it to the libraries listed below:
- VietTTS: An Open-Source Vietnamese Text to Speech ☆52 · Updated 4 months ago
- Fine-tune the LLM part of the Spark-TTS model ☆65 · Updated last month
- A High-Quality and Large-Scale Dataset for English-Vietnamese Speech Translation (INTERSPEECH 2022) ☆21 · Updated 9 months ago
- An implementation for training the HiFi-GAN part of the XTTSv2 model using Coqui/TTS. ☆76 · Updated 5 months ago
- Vi_G2P or ViG2P: a G2P package for Vietnamese, based on vPhon and phonological knowledge, that converts raw text (Graphoneme) to IPA ☆84 · Updated 10 months ago
- Transformation of spoken text to written text ☆30 · Updated 11 months ago
- Vietnamese Punctuation Prediction using Pretrained Language Models ☆13 · Updated 3 years ago
- Whisper fine-tuned on VinBigdata-VLSP2020-100h + KenLM ☆39 · Updated last year
- Python script to download and process data for training a speech-synthesis model of Vietnamese M.C. Nguyễn Ngọc Ngạn ☆13 · Updated 8 months ago
- Official repo for the Vietnam-Celeb dataset ☆20 · Updated last year
- Python - NSW package for Vietnamese: Normalization system to convert numbers, abbreviations, and words that cannot be pronounced into syl… ☆59 · Updated 4 months ago
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions" ☆39 · Updated last month
- Code repository for FreGrad ☆51 · Updated 11 months ago
- Conformer RNN-Transducer ☆15 · Updated 2 years ago
- Ichigo Whisper is a compact (22M parameters), open-source speech tokenizer for the Whisper-medium, designed to enhance performance on mul… ☆15 · Updated 3 months ago
- Models and code for the INTERSPEECH 2023 paper "DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model" ☆12 · Updated last month
- Official Repository For VoxBlink2 ☆67 · Updated 8 months ago
- Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis ☆24 · Updated last month
- Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995 ☆76 · Updated 5 months ago
- Official repository for Mamba-based Segmentation Model for Speaker Diarization ☆35 · Updated 7 months ago
- Implementation of "A conformer-based classifier for variable-length utterance processing in anti-spoofing", published in Interspeech 2023. ☆23 · Updated last year
- End-to-End Vietnamese Speech Recognition using wav2vec 2.0 ☆98 · Updated 3 years ago
- wav2vec2 audio classification for prosodic boundary detection and other tasks ☆42 · Updated last year
- SSL Layerwise analysis for speech deepfake detection ☆22 · Updated 2 months ago
- Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task. ☆22 · Updated 2 months ago