MingLunHan / X-LLM-Speech

☆10

Related projects:

LitLeo / 3m-asr-inference
☆13Updated this week
nervjack2 / Speech2Unit
☆11Updated 2 weeks ago
zeyuxie29 / AudioTime
☆22Updated 2 months ago
openaudiolab / LLaST
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
☆11Updated last month
ttslr / MonTTS
☆13Updated 2 years ago
audiodemo / voice-conversion
Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
☆17Updated last year
wenet-e2e / wenet-tts
☆13Updated this week
wenet-e2e / WeSpeech-AI
Open Source Speech/Text Data on AI
☆18Updated 2 years ago
0nutation / SLMTokBench
SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"
☆32Updated last year
karchkha / MelSpec_GPT_VQVAE
Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms
☆18Updated 11 months ago
mct10 / CoBERT
Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning
☆46Updated 10 months ago
wentaozhu / speechnas
SpeechNAS: Better Trade-off between Latency and Accuracy for Large-Scale Speaker Verification
☆30Updated last year
SpeechColab / PySpeechColab
A library of speech gadgets.
☆13Updated last year
lifeiteng / VoiceBox
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale
☆25Updated last year
Hertin / WavPrompt
☆35Updated 2 years ago
Infinity-INF / fast-phasr
Phoneme and duration labeling based on Whisper small
☆12Updated 2 months ago
NeuroWave-ai / CUCVAE-TTS
☆25Updated 2 years ago
VITA-Group / Audio-Lottery
[ICLR 2022] "Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable", by Shaojin Ding, Tianlong Chen, Z…
☆30Updated 2 years ago
RicherMans / SAT
Streaming Audio Transformers for online audio tagging
☆39Updated 3 months ago
ex3ndr / supervoice-librilight-preprocessed
60k hours of phoneme-aligned audio from audiobooks
☆18Updated last month
speechnovateur / languagecodec_tmp
Temporary anonymous version
☆22Updated 6 months ago
Sreyan88 / LipGER
Code for Interspeech 2024 Paper: LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition
☆10Updated 2 months ago
shengcanxu / canoSpeech
text to speech
☆10Updated 6 months ago
ex3ndr / supervoice-hybrid
My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
☆27Updated last month
MingLunHan / CIF-ColDec
[ICASSP 2022] Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection
☆23Updated last year
amphionspace / tts-evaluation
An evaluation set for large-scale trained TTS models (Coming in Sep 2024)
☆10Updated 2 weeks ago
AbrahamSanders / codec-bpe
Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs
☆33Updated last week
ictnlp / ComSpeech
Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".
☆21Updated 2 months ago
ga642381 / SpeechGen
"SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts"
☆74Updated last year