huggingface / audio-transformers-course
The Hugging Face Course on Transformers for Audio
☆313 Updated last month
Related projects:
- ☆342 Updated 6 months ago
- Place where folks can contribute to 🤗 community events ☆394 Updated 9 months ago
- ☆241 Updated 3 months ago
- ☆431 Updated 2 months ago
- Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch ☆592 Updated 7 months ago
- Fine-tune and evaluate Whisper models for Automatic Speech Recognition (ASR) on custom datasets or datasets from huggingface. ☆219 Updated last year
- Implementation of Meta-Voicebox : The first generative AI model for speech to generalize across tasks with state-of-the-art performance. ☆551 Updated last year
- HF's ML for Audio study group ☆182 Updated last year
- SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders. ☆309 Updated last week
- Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand". ☆360 Updated 4 months ago
- ☆272 Updated 2 weeks ago
- Audio Dataset for training CLAP and other models ☆615 Updated 7 months ago
- The Open Source Code of UniAudio ☆509 Updated last month
- NeMo text processing for ASR and TTS ☆266 Updated this week
- MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation ☆353 Updated last year
- Finetune VITS and MMS using HuggingFace's tools ☆110 Updated 5 months ago
- Contrastive Language-Audio Pretraining ☆1,335 Updated 2 months ago
- Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate ☆348 Updated last week
- Collection of resources on the applications of Large Language Models (LLMs) in Audio AI. ☆541 Updated last month
- Learning audio concepts from natural language supervision ☆458 Updated 3 months ago
- Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis ☆762 Updated last month
- Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event … ☆312 Updated 6 months ago
- ☆244 Updated 6 months ago
- Unified-Modal Speech-Text Pre-Training for Spoken Language Processing ☆1,161 Updated 4 months ago
- WavJourney: Compositional Audio Creation with LLMs ☆513 Updated 11 months ago
- Official PyTorch implementation of BigVGAN (ICLR 2023) ☆841 Updated 2 weeks ago
- ☆149 Updated last year
- An Audio Language model for Audio Tasks ☆281 Updated 5 months ago
- Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead of images. ☆701 Updated 2 months ago
- Localized watermarking for AI-generated speech audios, with SOTA on robustness and very fast detector ☆411 Updated 3 weeks ago