AlonKellner / waloviz
An open source interactive spectrogram audio player, primarily based on bokeh and the holoviz stack (wav+holoviz=waloviz)
☆65 Updated 3 months ago
Related projects
Alternatives and complementary repositories for waloviz
- Audio tokenization, in the fastest way possible! ☆45 Updated 2 months ago
- GPT-style network for phonemization with durations of text ☆62 Updated 7 months ago
- The YouTube Text-To-Speech dataset comprises waveform audio extracted from YouTube videos alongside their English transcriptions ☆50 Updated 3 years ago
- Speakerbox: Fine-tune Audio Transformers for speaker identification. ☆52 Updated 8 months ago
- Local emulator for Hugging Face Inference Endpoints custom handlers ☆24 Updated last year
- ☆21 Updated last year
- ☆37 Updated 4 months ago
- Tunable pipelines ☆29 Updated 3 weeks ago
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT" ☆28 Updated 3 weeks ago
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ☆49 Updated last week
- ☆84 Updated 7 months ago
- An espeak-compatible, permissively-licensed IPA phonemizer (G2P) based on DeepPhonemizer. Usable as a drop-in replacement for espeak's GP… ☆83 Updated last month
- Recaption large (Web)Datasets with vllm and save the artifacts. ☆30 Updated last month
- GPT for FACodec ☆13 Updated 7 months ago
- ☆12 Updated 10 months ago
- A dashboard for exploring timm learning rate schedulers ☆18 Updated last year
- ☆16 Updated 2 years ago
- ☆52 Updated 2 weeks ago
- ☆12 Updated last year
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆12 Updated 5 months ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data ☆13 Updated 3 months ago
- PyTorch video decoding ☆77 Updated this week
- Implementation of a Light Recurrent Unit in PyTorch ☆46 Updated last month
- 🤝 Trade any tensors over the network ☆30 Updated last year
- A starter kit for evaluating benchmarks on the 🤗 Hub ☆13 Updated 10 months ago
- ☆23 Updated last year
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆45 Updated this week
- Official Code for ParrotTTS ☆42 Updated 3 weeks ago
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems. ☆36 Updated last year
- [NCMMSC'2024] Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech ☆22 Updated 2 months ago
- A Python library and CLI tool to do automatic syllabification of Spanish words ☆14 Updated 4 years ago