HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
☆470 · Sep 20, 2023 · Updated 2 years ago
Alternatives and similar repositories for huggingsound
Users that are interested in huggingsound are comparing it to the libraries listed below.
- ☆205 · Feb 22, 2022 · Updated 4 years ago
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding ☆75 · Oct 11, 2021 · Updated 4 years ago
- Segment an audio file and obtain utterance alignments. (Python package) ☆347 · May 15, 2024 · Updated 2 years ago
- ☆357 · Mar 17, 2024 · Updated 2 years ago
- ASRecognition: just an easy-to-use library for Automatic Speech Recognition. ☆50 · Mar 6, 2023 · Updated 3 years ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages. ☆80 · May 20, 2023 · Updated 3 years ago
- ☆41 · Jan 14, 2022 · Updated 4 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆53 · Dec 6, 2022 · Updated 3 years ago
- Using YouTube to prepare a speech recognition dataset for any language ☆10 · Mar 30, 2021 · Updated 5 years ago
- 56 languages, 1 model: multilingual ASR ☆24 · Jul 25, 2021 · Updated 4 years ago
- Code for the winning solution in the SE&R 2022 Challenge - SER track. ☆16 · Mar 28, 2023 · Updated 3 years ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment ☆16 · Apr 13, 2022 · Updated 4 years ago
- A toolset for easy formant extraction and visualization from wav files and TTS models ☆33 · Sep 2, 2022 · Updated 3 years ago
- Grapheme to phoneme conversion with deep learning. ☆425 · Dec 8, 2023 · Updated 2 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper ☆120 · Feb 4, 2023 · Updated 3 years ago
- ICASSP 2023 Accepted ☆190 · May 6, 2024 · Updated 2 years ago
- Avocodo: Generative Adversarial Network for Artifact-free Vocoder ☆122 · Jul 14, 2022 · Updated 3 years ago
- ☆163 · Sep 19, 2022 · Updated 3 years ago
- Live speech recognition using Facebook's wav2vec 2.0 model. ☆378 · Feb 4, 2024 · Updated 2 years ago
- Official code for Wav2Seq ☆97 · Jul 19, 2022 · Updated 3 years ago
- PyTorch Implementation of ProDiff (ACM-MM'22) with an Extremely-Fast diffusion speech synthesis pipeline ☆432 · Apr 19, 2023 · Updated 3 years ago
- NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP] ☆25 · Jul 5, 2022 · Updated 3 years ago
- A library of speech gadgets. ☆14 · Oct 15, 2022 · Updated 3 years ago
- A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration… ☆328 · Sep 24, 2022 · Updated 3 years ago
- ☆76 · Oct 25, 2021 · Updated 4 years ago
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode ☆111 · Aug 31, 2022 · Updated 3 years ago
- 🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation ☆264 · Nov 15, 2025 · Updated 6 months ago
- Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-gramma… ☆21 · Jan 24, 2022 · Updated 4 years ago
- A fast and lightweight python-based CTC beam search decoder for speech recognition. ☆468 · Jul 13, 2023 · Updated 2 years ago
- Wav2vec resources and models for Brazilian Portuguese ☆36 · Jul 15, 2022 · Updated 3 years ago
- Tools for handling multimodal data in machine learning projects. ☆1,128 · Updated this week
- A library for speech data augmentation in time-domain ☆687 · Aug 30, 2021 · Updated 4 years ago
- A differentiable version of SPTK ☆200 · May 18, 2026 · Updated last week
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning ☆97 · Nov 20, 2024 · Updated last year
- ☆55 · Jan 13, 2023 · Updated 3 years ago
- ☆57 · Dec 19, 2022 · Updated 3 years ago
- speech to text with self-supervised learning based on wav2vec 2.0 framework ☆380 · Nov 22, 2021 · Updated 4 years ago
- Multilingual G2P in 100 languages ☆384 · May 26, 2023 · Updated 3 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models. ☆13 · Feb 13, 2021 · Updated 5 years ago