guxm2021/ALT_SpeechBrain

[ISMIR 2022] Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription

☆51

Alternatives and similar repositories for ALT_SpeechBrain

Users that are interested in ALT_SpeechBrain are comparing it to the libraries listed below.

guxm2021 / MM_ALT
[MM 2022] MM-ALT: A Multimodal Automatic Lyric Transcription System (Oral, Top paper award)
☆21 · Mar 16, 2024 · Updated 2 years ago
f90 / jamendolyrics
DEPRECATED: Jamendo music dataset with time-aligned lyrics for lyrics alignment evaluation
☆88 · Apr 30, 2025 · Updated last year
jhuang448 / E2E-LyricsAlignment-Implementation
Implementation of paper "End-to-end lyrics alignment for polyphonic music using an audio-to-character recognition model"
☆18 · Nov 20, 2022 · Updated 3 years ago
emirdemirel / DALI-TestSet4ALT
This is a subset of the DALI set consisting of 240 polyphonic recordings that is used to benchmark lyrics transcription evaluation.
☆12 · Nov 30, 2021 · Updated 4 years ago
emirdemirel / ALTA
A complete training recipe for kaldi-based Automatic Lyrics Transcription.
☆32 · Nov 30, 2021 · Updated 4 years ago
kwatcharasupat / musdb25
MUSDB25 - A Fully Multitrack Dataset for Music Source Separation
☆13 · Mar 29, 2025 · Updated last year
zerospeech / zerospeech2021
Zerospeech Challenge 2021: validation and evaluation software
☆12 · Jun 13, 2022 · Updated 4 years ago
groadabike / Kaldi-Dsing-task
DSing ASR task: Resources and Baseline for an unaccompanied singing ASR.
☆19 · Jul 9, 2026 · Updated 2 weeks ago
guxm2021 / SVT_SpeechBrain
[TOMM 2024] Automatic Lyric Transcription and Automatic Music Transcription from Multimodal Singing
☆28 · Aug 30, 2024 · Updated last year
W-Wu / DEER
☆12 · Aug 25, 2023 · Updated 2 years ago
rupakvignesh / Lyrics-to-Audio-Alignment
Aligns text (lyrics) with monophonic singing voice (audio). The algorithm uses structural segmentation to segment the audio into structur…
☆94 · Feb 13, 2018 · Updated 8 years ago
wei-zeng98 / piano-a2s
End-to-end real-world polyphonic piano audio-to-score transcription with hierarchical decoding (IJCAI 2024)
☆41 · Sep 17, 2024 · Updated last year
Itachi6912110 / Hierarchical-Note-Segmentation
Realization for note segmentation by using hierarchical objective function
☆14 · Jun 26, 2019 · Updated 7 years ago
jwr1995 / WD-TCN
☆11 · Aug 5, 2022 · Updated 3 years ago
SwagLyrics / autosynch
Automated lyrics-to-audio alignment using syllabic nuclei detection. Developed during Google Summer of Code 2019.
☆53 · Jul 6, 2023 · Updated 3 years ago
kyungyunlee / ismir2018-revisiting-svd
Revisiting Singing Voice Detection: a Quantitative Review and the Future Outlook
☆69 · Nov 21, 2022 · Updated 3 years ago
schufo / lyrics-aligner
Automatic lyrics alignment at phoneme or word level with a pre-trained deep neural network.
☆41 · Aug 21, 2023 · Updated 2 years ago
Rongjiehuang / awesome-speech-to-speech-translation
List of direct speech-to-speech translation papers.
☆39 · Jan 31, 2023 · Updated 3 years ago
RickyL-2000 / AlignSTS
Findings of ACL 2023 | AlignSTS: a speech-to-singing (STS) model based on modality disentanglement and cross-modal alignment
☆68 · Jul 5, 2024 · Updated 2 years ago
itec-hust / MusicYOLO
MusicYOLO framework uses the object detection model, YOLOx, to locate notes in the spectrogram.
☆18 · Jan 29, 2022 · Updated 4 years ago
dhchoi99 / NANSY
☆171 · Jul 25, 2022 · Updated 4 years ago
KanikeSaiPrakash / Speech-Emotion-Recognition
Speech Emotion Recognition using Deep Learning
☆13 · May 24, 2021 · Updated 5 years ago
JoungheeKim / K-wav2vec
☆87 · Dec 21, 2022 · Updated 3 years ago
PlayVoice / VI-Speaker
Speaker embedding for VI-SVC and VI-SVS, also for VITS; use this to replace the ID to implement voice cloning.
☆30 · Sep 16, 2022 · Updated 3 years ago
hcy71o / TransferTTS
TransferTTS (Zero-Shot learning of VITS)
☆102 · Sep 23, 2022 · Updated 3 years ago
ashispati / GuitarSoloDetection
Code accompanying AES Semantic Audio Conference paper titled "A Dataset and Method for Guitar Solo Detection in Rock Music"
☆11 · Jan 18, 2018 · Updated 8 years ago
york135 / singing_transcription_ICASSP2021
The source code and pre-trained model of the paper "On the Preparation and Validation of a Large-scale Dataset"
☆70 · Mar 5, 2026 · Updated 4 months ago
hcy71o / SC-CNN
SC-CNN: Effective Speaker Conditioning Method for Zero-Shot Multi-Speaker Text-to-Speech Systems
☆39 · Nov 1, 2023 · Updated 2 years ago
DeNA / Face2Speech
☆20 · Mar 16, 2020 · Updated 6 years ago
patrickltobing / cyclevae-vc-neuralvoco
☆91 · Sep 24, 2021 · Updated 4 years ago
RanaCM / DSU-AVO
Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023
☆12 · May 13, 2024 · Updated 2 years ago
SonyCSLParis / vqcpc-gan
VQCPC-GAN: Variable-length Adversarial Audio Synthesis using Vector-Quantized Contrastive Predictive Coding
☆14 · Apr 27, 2021 · Updated 5 years ago
RickyL-2000 / ROSVOT
Robust Singing Voice Transcription and MIDI Extraction
☆123 · Nov 20, 2024 · Updated last year
rgzn-aiyun / melgan-cpu
Real-time melgan based on cpu ！！！
☆13 · Dec 3, 2019 · Updated 6 years ago
Bartelds / neural-acoustic-distance
Code associated with the paper: Neural Representations for Modeling Variation in Speech.
☆18 · Mar 10, 2022 · Updated 4 years ago
SeanPLeary / dc_tts-transfer-learning
Transfer learning exploration of dc_tts text-to-speech model
☆21 · Mar 5, 2019 · Updated 7 years ago
HaoranMiao / streaming-attention
streaming attention networks for end-to-end automatic speech recognition
☆56 · May 6, 2020 · Updated 6 years ago
Bose / RAVEN
☆20 · Oct 6, 2025 · Updated 9 months ago
DanielMengLiu / AudioVisualLip
☆25 · Feb 20, 2024 · Updated 2 years ago