ronggong/interspeech2018_submission01

Supplementary information and code for INTERSPEECH 2018 paper: Singing voice phoneme segmentation by hierarchically inferring syllable and phoneme onset positions

☆46

Alternatives and similar repositories for interspeech2018_submission01

Users who are interested in interspeech2018_submission01 are comparing it to the repositories listed below.

felixkreuk / SegFeat
Phoneme Boundary Detection using Learnable Segmental Features (ICASSP 2020)
☆83 · Nov 13, 2021 · Updated 4 years ago
hiromu / VoiceConversion
Voice conversion tools for STRAIGHT
☆29 · Jul 17, 2020 · Updated 6 years ago
CODEJIN / MLPSinger
☆24 · Mar 15, 2022 · Updated 4 years ago
itec-hust / MusicYOLO
The MusicYOLO framework uses the object detection model YOLOX to locate notes in the spectrogram.
☆18 · Jan 29, 2022 · Updated 4 years ago
sp-nitech / DNN-HSMM
PyTorch implementation of DNN-HSMM for TTS
☆71 · Mar 14, 2021 · Updated 5 years ago
schufo / plla-tisvs
Phoneme Level Lyrics Alignment and Text-Informed Singing Voice Separation
☆24 · Nov 8, 2021 · Updated 4 years ago
MontrealCorpusTools / MFA-reorganization-scripts
Collection of scripts and utilities for reorganizing corpora to use with the Montreal Forced Aligner
☆43 · Jun 22, 2021 · Updated 5 years ago
jonysugianto / vad_lsfm
Efficient voice activity detection algorithm using long-term spectral flatness measurement
☆15 · Feb 21, 2017 · Updated 9 years ago
chomeyama / UnifiedSourceFilterGAN
☆20 · Jun 5, 2022 · Updated 4 years ago
nii-yamagishilab / speaker_sex_attribute_privacy
Project for HIDING SPEAKER’S SEX IN SPEECH USING ZERO-EVIDENCE SPEAKER REPRESENTATION IN AN ANALYSIS/SYNTHESIS PIPELINE
☆15 · Nov 30, 2022 · Updated 3 years ago
kylerbrown / textgrid
Simple TextGrid to CSV converter
☆27 · Jul 29, 2021 · Updated 5 years ago
zizyzhang / DNN-Based-Singing-Voice-Synthesis
DNN based singing voice synthesis
☆17 · Oct 15, 2018 · Updated 7 years ago
lingjzhu / charsiu
Charsiu: A neural phonetic aligner.
☆347 · Sep 19, 2022 · Updated 3 years ago
zqs01 / data2vecnoisy
☆11 · Oct 20, 2022 · Updated 3 years ago
kyungyunlee / ismir2018-revisiting-svd
Revisiting Singing Voice Detection: a Quantitative Review and the Future Outlook
☆69 · Nov 21, 2022 · Updated 3 years ago
open-speech / speech-aligner
speech-aligner is a tool that generates phoneme-level time-alignment annotations from human speech and its corresponding text.
☆410 · Apr 8, 2020 · Updated 6 years ago
alphacep / whisper-prompts
OpenAI Whisper Prompt Examples
☆53 · Jul 17, 2023 · Updated 3 years ago
bastibe / MAPS-Scripts
A fundamental frequency estimation algorithm using features from the magnitude and phase spectrogram.
☆25 · Mar 29, 2021 · Updated 5 years ago
huangruizhe / audio
Data manipulation and transformation for audio signal processing, powered by PyTorch
☆10 · Sep 30, 2024 · Updated last year
MattShannon / HTS-demo_CMU-ARCTIC-SLT-STRAIGHT-AR-decision-tree
Autoregressive HMM version of the HTS demo for statistical speech synthesis (includes autoregressive clustering)
☆16 · Sep 12, 2014 · Updated 11 years ago
dathudeptrai / FastSpeech2
A TensorFlow Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
☆11 · Aug 12, 2020 · Updated 5 years ago
lusensama / Obamanet_retrain
ObamaNet fork
☆12 · Sep 16, 2019 · Updated 6 years ago
superbock / ISMIR2020
Supplementary material for the ISMIR 2020 paper: “Deconstruct, Analyse, Reconstruct: how to improve tempo, beat, and downbeat estimation”…
☆12 · Mar 2, 2021 · Updated 5 years ago
amirharati / kaldi-alligner
Scripts to align a given waveform to its transcription using models trained with Kaldi
☆37 · Aug 15, 2019 · Updated 6 years ago
bill317996 / Melody-extraction-with-melodic-segnet
The source code of "A Streamlined Encoder/Decoder Architecture for Melody Extraction"
☆74 · Feb 10, 2020 · Updated 6 years ago
shamidreza / dnnmapper
Mapping features using Deep Neural Networks (DNNs) with application to Voice Conversion (VC). The implementations are on top of Theano Py…
☆32 · May 30, 2018 · Updated 8 years ago
chorowski-lab / hCPC
Implementation of multi-level Contrastive Predictive Coding (CPC) methods
☆20 · Jan 12, 2023 · Updated 3 years ago
danpovey / conditional-flow-matching
☆29 · Aug 8, 2024 · Updated last year
jasonppy / syllable-discovery
Syllable Segmentation and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model
☆35 · Aug 27, 2023 · Updated 2 years ago
npuichigo / extract_features_using_world
Using the WORLD vocoder to extract features and prepare data for training neural networks
☆11 · Oct 9, 2017 · Updated 8 years ago
makerjackie / MTTS
A Demo of Mandarin/Chinese TTS frontend
☆284 · Apr 18, 2022 · Updated 4 years ago
motazsaad / ara-pronunciation-tool
A Python tool that converts Arabic diacritised text to a sequence of phonemes and creates a pronunciation dictionary. This code is based …
☆15 · Sep 5, 2017 · Updated 8 years ago
SpeechColab / GigaSpeechBench
☆29 · Jul 21, 2026 · Updated last week
the-bird-F / GLM-Voice-RAG
[EMNLP 2025 Findings] A complete cross-modal RAG system for end-to-end speech-to-speech large models, including ASR-based Retrieval and E…
☆31 · Jul 11, 2025 · Updated last year
idiap / apam
The APAM toolkit is built on PyTorch and provides recipes to adapt pretrained acoustic models with a variety of sequence discriminative train…
☆14 · Feb 15, 2021 · Updated 5 years ago
RickyL-2000 / AlignSTS
Findings of ACL 2023 | AlignSTS: a speech-to-singing (STS) model based on modality disentanglement and cross-modal alignment
☆68 · Jul 5, 2024 · Updated 2 years ago
PlayVoice / BigVGAN
BigVGAN with Neural Source-Filter
☆58 · Sep 21, 2023 · Updated 2 years ago
biggytruck / SpeechSplit2
Official implementation of SpeechSplit2
☆135 · Oct 22, 2022 · Updated 3 years ago
felixkreuk / UnsupSeg
Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation (INTERSPEECH 2020)
☆146 · Aug 5, 2022 · Updated 3 years ago