JuanFMontesinos/Acappella-YNet

JuanFMontesinos / Acappella-YNet

Official implementation of A cappella: Audio-visual Singing Voice Separation, from BMVC21

☆18

Alternatives and similar repositories for Acappella-YNet

Users that are interested in Acappella-YNet are comparing it to the libraries listed below.

JusperLee / Look2hear
A toolkit for researchers in multimodal sound separation.
☆16 · Oct 20, 2023 · Updated 2 years ago
JuanFMontesinos / torch_mir_eval
Backpropagable pytorch implementation of https://craffel.github.io/mir_eval/.
☆35 · Jul 8, 2024 · Updated 2 years ago
Veleslavia / conditioned-u-net
Conditioned U-Net for Music Source Separation
☆20 · May 15, 2021 · Updated 5 years ago
apmcleod / voice-splitting
A Java project which is able to split MIDI performance data into monophonic voices.
☆23 · Aug 26, 2020 · Updated 5 years ago
JuanFMontesinos / Solos
Solos: A Dataset for Audio-Visual Music Analysis
☆24 · Feb 17, 2023 · Updated 3 years ago
MiuLab / Lattice-ELMo
Source code for ACL 2020 paper "Learning Spoken Language Representations with Neural Lattice Language Modeling"
☆18 · Feb 11, 2023 · Updated 3 years ago
etzinis / optimal_condition_training
Code and data recipes for the paper: Optimal Condition Training for Target Source Separation by Efthymios Tzinis, Gordon Wichern, Paris S…
☆14 · Feb 15, 2023 · Updated 3 years ago
MiuLab / Lattice-Transformer-SLU
Source code for ASRU 2019 paper "Adapting Pretrained Transformer to Lattices for Spoken Language Understanding"
☆10 · Jul 8, 2020 · Updated 6 years ago
ictnlp / FastLongSpeech
FastLongSpeech is a novel framework designed to extend the capabilities of Large Speech-Language Models for efficient long-speech process…
☆16 · Jul 22, 2025 · Updated last year
ductuantruong / speaker_age_estimation_ssl_study
[APSIPA'22] Exploring Speaker Age Estimation on Different Self-Supervised Learning Models
☆14 · Oct 19, 2022 · Updated 3 years ago
kagaminccino / LAVSE
Python codes for Lite Audio-Visual Speech Enhancement.
☆95 · May 3, 2024 · Updated 2 years ago
danmic / av-se
Deep-Learning-Based Audio-Visual Speech Enhancement and Separation
☆222 · Apr 16, 2023 · Updated 3 years ago
exeex / vocoder_eva
used to evaluate wavenet vocoder by rmse f0, MCD, rmse ap...
☆15 · Jan 20, 2020 · Updated 6 years ago
stefan-balke / mpa-exc
Some Demo Code for the MPA Exercise.
☆10 · Dec 4, 2017 · Updated 8 years ago
csukuangfj / icefall
☆11 · Jul 16, 2026 · Updated last week
vskadandale / instrument-recognition-polyphonic
Implementations for master thesis "Musical Instrument Recognition in Multi-Instrument Audio Contexts" with MedleyDB.
☆16 · Apr 4, 2019 · Updated 7 years ago
WangHelin1997 / LibriLightMix-WHAMR
Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM
☆17 · Nov 7, 2024 · Updated last year
Enescigdem / SignLanguageRecognizer
☆16 · Nov 8, 2020 · Updated 5 years ago
dr-pato / SSGD
Code of the paper "Low-Latency Speech Separation Guided Diarization for Telephone Conversations"
☆15 · Dec 22, 2022 · Updated 3 years ago
Beilong-Tang / TSELM
Official Implementation of TSELM: Target speaker extraction using discrete tokens and language models
☆60 · Apr 14, 2025 · Updated last year
dr-pato / audio_visual_speech_enhancement
Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments
☆112 · Mar 19, 2024 · Updated 2 years ago
Aisaka0v0 / TS-Whisper
☆33 · Jun 12, 2025 · Updated last year
msaadsaeed / SBNet
Official implementation of SBNet as described in "Single-branch Network for Multimodal Training".
☆13 · Aug 28, 2023 · Updated 2 years ago
RetroCirce / Choral_Music_Separation
Chorale Music Separation Dataset and Model Framework
☆41 · Dec 5, 2022 · Updated 3 years ago
mfischer-ucl / metappearance
Metappearance: Meta-Learning for Visual Appearance Reproduction
☆22 · Sep 19, 2022 · Updated 3 years ago
stoneMo / SLAVC
Official Codebase of "A Closer Look at Weakly-Supervised Audio-Visual Source Localization" (NeurIPS 2022)
☆22 · Dec 6, 2022 · Updated 3 years ago
ronggong / mispronunciation-detection
Mispronunciation detection code for jingju singing voice
☆19 · Sep 5, 2018 · Updated 7 years ago
AlanBaade / MAE-AST-Public
Public Code for the paper MAE-AST: Masked Autoencoding Audio Spectrogram Transformer
☆93 · Jun 9, 2022 · Updated 4 years ago
asteroid-team / pytorch_stoi
STOI loss functions in PyTorch (mirror of https://github.com/mpariente/pytorch_stoi)
☆15 · Aug 6, 2020 · Updated 5 years ago
leto19 / WhiSQA
Whisper Speech Quality Assessment (WhiSQA)
☆16 · Apr 14, 2026 · Updated 3 months ago
yongyizang / SingFake
Official Repository for "SingFake: Singing Voice Deepfake Detection"
☆64 · Feb 26, 2024 · Updated 2 years ago
vskadandale / vocalist
Official repository for the paper VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices
☆73 · Apr 7, 2024 · Updated 2 years ago
desh2608 / gss
A simple package for Guided source separation (GSS)
☆134 · May 20, 2024 · Updated 2 years ago
SDNNetSim / FUSION
FUSION is an open-source project aimed at revolutionizing networking through the simulation of advanced SD-EONs and AI-enhanced networks,…
☆15 · Jun 23, 2026 · Updated last month
AXIHIXA / UGrid
[ICML 2024] UGrid: An Efficient-And-Rigorous Neural Multigrid Solver for Linear PDEs
☆12 · Aug 7, 2025 · Updated 11 months ago
JuanFMontesinos / VoViT
VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
☆35 · Mar 18, 2023 · Updated 3 years ago
amazon-science / iwslt-autodub-task
☆21 · Mar 4, 2024 · Updated 2 years ago
edemattos / asr
Automatic Speech Recognition at the University of Edinburgh.
☆16 · Mar 14, 2021 · Updated 5 years ago
nii-yamagishilab / ZMM-TTS
ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
☆185 · Mar 6, 2024 · Updated 2 years ago