jonatasgrosman / huggingsound
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
☆469 · Sep 20, 2023 · Updated 2 years ago
Alternatives and similar repositories for huggingsound
Users that are interested in huggingsound are comparing it to the libraries listed below
- Segment an audio file and obtain utterance alignments. (Python package) ☆345 · May 15, 2024 · Updated last year
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆75 · Oct 11, 2021 · Updated 4 years ago
- ☆357 · Mar 17, 2024 · Updated last year
- Code for the winning solution in the SE&R 2022 Challenge - SER track. ☆16 · Mar 28, 2023 · Updated 2 years ago
- Avocodo: Generative Adversarial Network for Artifact-free Vocoder ☆122 · Jul 14, 2022 · Updated 3 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆53 · Dec 6, 2022 · Updated 3 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆119 · Feb 4, 2023 · Updated 3 years ago
- Grapheme to phoneme conversion with deep learning. ☆419 · Dec 8, 2023 · Updated 2 years ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆80 · May 20, 2023 · Updated 2 years ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 · Apr 13, 2022 · Updated 3 years ago
- ICASSP 2023 Accepted ☆189 · May 6, 2024 · Updated last year
- 56 language, 1 model Multilingual ASR ☆24 · Jul 25, 2021 · Updated 4 years ago
- A toolset for easy formant extraction and visualization from wav files and TTS models ☆33 · Sep 2, 2022 · Updated 3 years ago
- Official code for Wav2Seq ☆97 · Jul 19, 2022 · Updated 3 years ago
- ☆163 · Sep 19, 2022 · Updated 3 years ago
- 🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation ☆262 · Nov 15, 2025 · Updated 3 months ago
- PyTorch Implementation of ProDiff (ACM-MM'22) with an Extremely-Fast diffusion speech synthesis pipeline ☆432 · Apr 19, 2023 · Updated 2 years ago
- A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration… ☆328 · Sep 24, 2022 · Updated 3 years ago
- ☆55 · Jan 13, 2023 · Updated 3 years ago
- ☆76 · Oct 25, 2021 · Updated 4 years ago
- Multilingual G2P in 100 languages ☆374 · May 26, 2023 · Updated 2 years ago
- A fast and lightweight python-based CTC beam search decoder for speech recognition. ☆467 · Jul 13, 2023 · Updated 2 years ago
- Live speech recognition using Facebook's wav2vec 2.0 model. ☆376 · Feb 4, 2024 · Updated 2 years ago
- Transcribing Speech with Multinomial Diffusion, training code and models. ☆80 · Sep 27, 2023 · Updated 2 years ago
- Unofficial implementation of HiFi-GAN+ from the paper "Bandwidth Extension is All You Need" by Su, et al. ☆223 · Oct 20, 2023 · Updated 2 years ago
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning ☆96 · Nov 20, 2024 · Updated last year
- Using YouTube to prepare a speech recognition dataset for any language ☆10 · Mar 30, 2021 · Updated 4 years ago
- A library of speech gadgets. ☆14 · Oct 15, 2022 · Updated 3 years ago
- Acoustic models for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion ☆104 · Jul 12, 2023 · Updated 2 years ago
- Wav2vec resources and models for Brazilian Portuguese ☆36 · Jul 15, 2022 · Updated 3 years ago
- NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP] ☆25 · Jul 5, 2022 · Updated 3 years ago
- Open Source Speech/Text Data on AI ☆19 · Sep 13, 2022 · Updated 3 years ago
- Tools for handling multimodal data in machine learning projects. ☆1,111 · Feb 2, 2026 · Updated 2 weeks ago
- VoiceSplit: Targeted Voice Separation by Speaker-Conditioned Spectrogram ☆265 · Jul 25, 2024 · Updated last year
- Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra. ☆717 · Oct 23, 2023 · Updated 2 years ago
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode ☆111 · Aug 31, 2022 · Updated 3 years ago
- ☆56 · Dec 19, 2022 · Updated 3 years ago
- A tokenizer, text cleaner, and phonemizer for many human languages. ☆331 · Nov 15, 2024 · Updated last year
- NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates @ INTERSPEECH 2022 ☆307 · Sep 16, 2023 · Updated 2 years ago