WangHelin1997/nnAudio2

WangHelin1997 / nnAudio2

Audio processing using PyTorch 1D convolution networks (based on nnAudio). Gammatone Spectrogram and SpecAugmentation are now available on GPU.

☆21

Alternatives and similar repositories for nnAudio2

Users who are interested in nnAudio2 are comparing it to the libraries listed below.

WangHelin1997 / DCASE-2020-Task1A-Code
A PyTorch implementation of the paper: Acoustic Scene Classification with Multiple Decision Schemes.
☆20 · Dec 12, 2020 · Updated 5 years ago
WangHelin1997 / AT-GCN
PyTorch implementation of the paper: Modeling Label Dependencies for Audio Tagging with Graph Convolutional Network
☆14 · Sep 18, 2020 · Updated 5 years ago
mcusi / gammatonegram
Python version of http://www.ee.columbia.edu/ln/rosa/matlab/gammatonegram/
☆15 · Oct 15, 2018 · Updated 7 years ago
wilkinghoff / sub-cluster-AdaCos
Accompanying code for the paper Sub-Cluster AdaCos: Learning Representations for Anomalous Sound Detection.
☆11 · Jun 7, 2022 · Updated 4 years ago
WangHelin1997 / GL-AT
PyTorch implementation of the paper: A Global-local Attention Framework for Weakly Labelled Audio Tagging.
☆13 · Feb 6, 2021 · Updated 5 years ago
lukewys / dcase_2020_T6
2nd place solution for 2020 DCASE challenge task 6 audio captioning. http://dcase.community/challenge2020/task-automatic-audio-captioning…
☆24 · Aug 3, 2023 · Updated 2 years ago
qiuqiangkong / dcase2019_task1
☆20 · May 13, 2019 · Updated 7 years ago
KinWaiCheuk / nnAudio
Audio processing using PyTorch 1D convolution networks
☆1,129 · May 21, 2026 · Updated 2 months ago
tencentmusic / TME-Audio-Super-Resolution-Samples
Audio samples for the paper 'Phase-aware music super-resolution using generative adversarial networks'
☆14 · May 15, 2020 · Updated 6 years ago
qiuqiangkong / sampleRNN_acoustic_scene_generation
☆14 · Apr 18, 2019 · Updated 7 years ago
sevagh / nsgt
PyTorch implementation of the NSGT/sliCQT
☆17 · Nov 10, 2023 · Updated 2 years ago
KinWaiCheuk / pytorch_template
Template that combines PyTorch Lightning and Hydra
☆16 · Aug 15, 2023 · Updated 2 years ago
haoxiangsnr / llm-tse
Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)
☆43 · Oct 13, 2023 · Updated 2 years ago
deepakbaby / se_relativisticgan
Keras framework for speech enhancement using relativistic GANs
☆52 · Jun 24, 2020 · Updated 6 years ago
sungnyun / avsr-temporal-dynamics
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
☆13 · Oct 22, 2024 · Updated last year
mdx-tutorial / mdx-tutorial.github.io
Tutorial covering Open Source tools for Source Separation.
☆15 · Nov 12, 2021 · Updated 4 years ago
denfed / wave-spec-fusion
Code for the submitted 2021 DCASE Workshop paper: "Waveforms and Spectrograms: Enhancing Acoustic Scene Classification Using Multimodal F…
☆16 · Aug 9, 2021 · Updated 4 years ago
Yip-Jia-Qi / codecformer
☆21 · Jul 15, 2024 · Updated 2 years ago
JunyiPeng00 / SLT22_MultiHead-Factorized-Attentive-Pooling
An attention-based backend allowing efficient fine-tuning of transformer models for speaker verification
☆24 · Sep 22, 2024 · Updated last year
sainathadapa / dcase2019-task5-urban-sound-tagging
1st place solution to the DCASE 2019 - Task 5 - Urban Sound Tagging
☆30 · Mar 19, 2021 · Updated 5 years ago
HolgerBovbjerg / SSL-PVAD
A repository for code used to produce the results of the ICASSP 2024 paper: "SELF-SUPERVISED PRETRAINING FOR ROBUST PERSONALIZED VOICE ACTIV…
☆25 · Nov 25, 2024 · Updated last year
AdvSV / AdvSV.github.io
AdvSV stands as the first dataset developed specifically for evaluating Speaker Verification (SV) systems against adversarial attacks. I…
☆11 · Nov 21, 2023 · Updated 2 years ago
ssrp / SubSpectralNet-PyTorch
PyTorch Implementation of SubSpectralNet - Using Sub-Spectrogram based Convolutional Neural Networks for Acoustic Scene Classification, a…
☆21 · Feb 20, 2019 · Updated 7 years ago
ZhihaoDU / du2022sond
Speaker overlap-aware Neural Diarization
☆12 · Feb 13, 2023 · Updated 3 years ago
leo-so / VocalMelodyExtPatchCNN
Vocal melody extraction using patch-based CNN
☆32 · Feb 5, 2018 · Updated 8 years ago
XinhaoMei / DCASE2021_task6_v2
Code for CVSSP submission to DCASE 2021 Task 6
☆36 · Nov 22, 2022 · Updated 3 years ago
yjlolo / dSEQ-VAE
BAD-VAE: A VAE framework for unsupervised disentanglement of sequential data
☆12 · May 25, 2022 · Updated 4 years ago
asteroid-team / asteroid-filterbanks
Asteroid's filterbanks
☆90 · Jan 12, 2025 · Updated last year
qiuqiangkong / sound_event_detection_dcase2017_task4
☆55 · Jun 3, 2020 · Updated 6 years ago
MTG / Podcastmix
PodcastMix: A dataset for separating music and speech in podcasts.
☆44 · Aug 20, 2024 · Updated last year
keunwoochoi / ismir-2019-posters
☆75 · Jan 6, 2020 · Updated 6 years ago
yoyolicoris / torch-fftconv
Implementation of 1D, 2D, and 3D FFT convolutions in PyTorch. Much faster than direct convolutions for large kernel sizes.
☆15 · May 18, 2026 · Updated 2 months ago
cpuimage / EqualLoudness
Equal Loudness Filter
☆11 · Mar 4, 2019 · Updated 7 years ago
jmpolom / sti-wav
Speech Transmission Index (STI) from real speech waveforms
☆15 · May 1, 2011 · Updated 15 years ago
yufenhuang / Guqin-dataset
Guqin performance analysis
☆12 · Aug 31, 2020 · Updated 5 years ago
bootphon / learnable-strf
Learnable STRF, from Riad et al. 2021 JASA
☆13 · Aug 21, 2021 · Updated 4 years ago
KinWaiCheuk / pytorch_musicnet
Complete implementation of MusicNet in PyTorch
☆12 · Apr 15, 2020 · Updated 6 years ago
yoyolicoris / spectrogram-inversion
Spectrogram inversion tools in PyTorch. Documentation: https://spectrogram-inversion.readthedocs.io
☆51 · Jun 12, 2025 · Updated last year
ssrp / SubSpectralNet
SubSpectralNet - Using Sub-Spectrogram based Convolutional Neural Networks for Acoustic Scene Classification, accepted in ICASSP 2019
☆18 · Feb 20, 2019 · Updated 7 years ago