seyong92 / phoneme-informed-note-level-singing-transcription
A pretrained model for "A Phoneme-informed Neural Network Model for Note-level Singing Transcription", ICASSP 2023
☆38 · Sep 9, 2023 · Updated 2 years ago
Alternatives and similar repositories for phoneme-informed-note-level-singing-transcription
Users interested in phoneme-informed-note-level-singing-transcription are comparing it to the repositories listed below.
- MusicYOLO framework uses the object detection model, YOLOx, to locate notes in the spectrogram. ☆15 · Jan 29, 2022 · Updated 4 years ago
- The source code and pre-trained model of the paper "On the Preparation and Validation of a Large-scale Dataset" ☆62 · Jan 7, 2026 · Updated last month
- Codes for ICASSP 2024 paper: BEAST: Online Joint Beat and Downbeat Tracking Based on Streaming Transformer. An online beat tracking syste… ☆41 · Sep 11, 2024 · Updated last year
- Source code for "Learning Similarity Metrics for Melody Retrieval" ☆29 · Oct 29, 2019 · Updated 6 years ago
- Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms ☆18 · Oct 8, 2023 · Updated 2 years ago
- The MIR-MLPop dataset and the official implementation of the paper "MIR-MLPop: A Multilingual Pop Music Dataset with Time-Aligned Lyrics … ☆32 · Apr 22, 2024 · Updated last year
- Machine learning tools and framework for automatic music transcription. ☆36 · Jun 17, 2024 · Updated last year
- ☆19 · Feb 2, 2023 · Updated 3 years ago
- VOCANO: A note transcription framework for singing voice in polyphonic music ☆72 · Aug 9, 2021 · Updated 4 years ago
- [EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion ☆33 · Sep 9, 2025 · Updated 5 months ago
- accompanying code for my ICASSP2021 paper ☆18 · Jan 6, 2022 · Updated 4 years ago
- PyTorch implementation of DiffRoll, a diffusion-based generative automatic music transcription (AMT) model ☆80 · Dec 6, 2023 · Updated 2 years ago
- This repository is for an implementation of the published paper "Translating Melody to Chord: Structured and Flexible Harmonization of Me… ☆21 · May 19, 2025 · Updated 8 months ago
- Robust Singing Voice Transcription and MIDI Extraction ☆109 · Nov 20, 2024 · Updated last year
- ☆17 · Jun 24, 2025 · Updated 7 months ago
- Phonemes and durations labeling based on whisper small ☆11 · Jul 7, 2024 · Updated last year
- ☆12 · Feb 3, 2026 · Updated 2 weeks ago
- An evolutionary algorithm that generates an accompaniment to a given melody that consists of triad chords while following music theory ru… ☆10 · Sep 19, 2022 · Updated 3 years ago
- Official implementation of SawSing (ISMIR'22) ☆272 · Aug 28, 2022 · Updated 3 years ago
- A Java project which is able to split MIDI performance data into monophonic voices. ☆23 · Aug 26, 2020 · Updated 5 years ago
- This is an unofficial implementation of universal melgan according to https://arxiv.org/abs/2011.09631 ☆23 · Aug 15, 2022 · Updated 3 years ago
- ONSETS&VELOCITIES real-time piano detection - PyTorch training [EUSIPCO2023] ☆28 · Aug 31, 2023 · Updated 2 years ago
- metadata for SHS100K ☆24 · Dec 25, 2017 · Updated 8 years ago
- The source code for the paper XiaoiceSing2 (interspeech2023) ☆49 · Jan 15, 2024 · Updated 2 years ago
- Similarity Learning applied to Speaker Verification and Semantic Textual Similarity ☆12 · Apr 8, 2020 · Updated 5 years ago
- Z.Wang & G.Xia, MuseBERT: Pre-training of Music Representation for Music Understanding and Controllable Generation, ISMIR 2021 ☆47 · Nov 8, 2021 · Updated 4 years ago
- TG-CRITIC: A TIMBRE-GUIDED MODEL FOR REFERENCE-INDEPENDENT SINGING EVALUATION ☆15 · May 26, 2023 · Updated 2 years ago
- Hand Assignment using Neural NeDworkS ☆13 · Jun 17, 2019 · Updated 6 years ago
- Inference code for Audiodec-Valle-Wenetspeech4TTS ☆50 · Jul 14, 2024 · Updated last year
- A large-scale speech corpus introduced in Spark-TTS, built from diverse open-source datasets for training text-to-speech (TTS) systems. ☆105 · May 5, 2025 · Updated 9 months ago
- Chord-Conditioned Melody Harmonization with Controllable Harmonicity [ICASSP 2023] ☆47 · Jul 15, 2023 · Updated 2 years ago
- Repository for ISMIR 2022 tutorial T3(M): Designing Controllable Synthesis System for Musical Signals ☆29 · Dec 3, 2022 · Updated 3 years ago
- ☆11 · Jan 2, 2025 · Updated last year
- Codebase for 'A Real-Time Lyrics Alignment System Using Chroma And Phonetic Features For Classical Vocal Performance', ICASSP 2024 ☆13 · Oct 4, 2024 · Updated last year
- a compact audio-to-phoneme aligner for singing voice ☆12 · Jan 17, 2024 · Updated 2 years ago
- Voice conversion model for real-time speech synthesis using PPG (Phonetic PosteriorGram) as an intermediate feature, written in Pytorch. ☆29 · Mar 3, 2022 · Updated 3 years ago
- Official code for paper:"Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding" ☆33 · Jan 28, 2026 · Updated 3 weeks ago
- DALI datasets split used to train models presented in the paper Multilingual lyrics-to-audio alignment (ISMIR 2020). ☆13 · May 25, 2021 · Updated 4 years ago
- A repository comprising code for generation of noisy speech data from clean data using deep learning methods ☆16 · Jul 12, 2021 · Updated 4 years ago