HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
☆470Sep 20, 2023Updated 2 years ago
Alternatives and similar repositories for huggingsound
Users that are interested in huggingsound are comparing it to the libraries listed below.
- ☆203Feb 22, 2022Updated 4 years ago
- Incorporating KenLM language model with HuggingFace implementation of Wav2Vec2CTC Model using beam search decoding☆75Oct 11, 2021Updated 4 years ago
- Segment an audio file and obtain utterance alignments. (Python package)☆346May 15, 2024Updated last year
- ☆357Mar 17, 2024Updated 2 years ago
- ASRecognition: just an easy-to-use library for Automatic Speech Recognition.☆50Mar 6, 2023Updated 3 years ago
- An easy way to fine-tune Wav2Vec 2.0 for low-resource languages.☆80May 20, 2023Updated 2 years ago
- ☆40Jan 14, 2022Updated 4 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆53Dec 6, 2022Updated 3 years ago
- 56 language, 1 model Multilingual ASR☆24Jul 25, 2021Updated 4 years ago
- Using YouTube to prepare a speech recognition dataset for any language☆10Mar 30, 2021Updated 5 years ago
- Code for the winning solution in the SE&R 2022 Challenge - SER track.☆16Mar 28, 2023Updated 3 years ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment☆16Apr 13, 2022Updated 3 years ago
- A toolset for easy formant extraction and visualization from wav files and TTS models☆33Sep 2, 2022Updated 3 years ago
- Grapheme to phoneme conversion with deep learning.☆425Dec 8, 2023Updated 2 years ago
- Zero-shot multimodal punctuation insertion and truecasing using Whisper☆119Feb 4, 2023Updated 3 years ago
- ICASSP 2023 Accepted☆190May 6, 2024Updated last year
- Avocodo: Generative Adversarial Network for Artifact-free Vocoder☆122Jul 14, 2022Updated 3 years ago
- ☆163Sep 19, 2022Updated 3 years ago
- Live speech recognition using Facebook's wav2vec 2.0 model.☆378Feb 4, 2024Updated 2 years ago
- Official code for Wav2Seq☆97Jul 19, 2022Updated 3 years ago
- PyTorch Implementation of ProDiff (ACM-MM'22) with an Extremely-Fast diffusion speech synthesis pipeline☆432Apr 19, 2023Updated 2 years ago
- NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP]☆25Jul 5, 2022Updated 3 years ago
- A library of speech gadgets.☆14Oct 15, 2022Updated 3 years ago
- A Non-Autoregressive Transformer based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration…☆328Sep 24, 2022Updated 3 years ago
- ☆76Oct 25, 2021Updated 4 years ago
- Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode☆111Aug 31, 2022Updated 3 years ago
- 🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation☆262Nov 15, 2025Updated 4 months ago
- Docker image and scripts for training finetuned or completely personal Kaldi speech models. Particularly for use with kaldi-active-gramma…☆21Jan 24, 2022Updated 4 years ago
- A fast and lightweight python-based CTC beam search decoder for speech recognition.☆469Jul 13, 2023Updated 2 years ago
- Wav2vec resources and models for Brazilian Portuguese☆37Jul 15, 2022Updated 3 years ago
- A library for speech data augmentation in time-domain☆684Aug 30, 2021Updated 4 years ago
- Tools for handling multimodal data in machine learning projects.☆1,121Mar 23, 2026Updated last week
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning☆97Nov 20, 2024Updated last year
- ☆55Jan 13, 2023Updated 3 years ago
- ☆57Dec 19, 2022Updated 3 years ago
- speech to text with self-supervised learning based on wav2vec 2.0 framework☆379Nov 22, 2021Updated 4 years ago
- Multilingual G2P in 100 languages☆379May 26, 2023Updated 2 years ago
- Implementation of different noise embeddings for noise aware training of Kaldi acoustic models.☆13Feb 13, 2021Updated 5 years ago
- ☆62Apr 11, 2023Updated 2 years ago