denfed/wave-spec-fusion

Code for the submitted 2021 DCASE Workshop paper: "Waveforms and Spectrograms: Enhancing Acoustic Scene Classification Using Multimodal Feature Fusion"

☆16

Alternatives and similar repositories for wave-spec-fusion

Users who are interested in wave-spec-fusion are comparing it to the libraries listed below.

CVxTz / COLA_pytorch
COLA contrastive pre-training method implemented in PyTorch
☆44 · Jan 27, 2021 · Updated 5 years ago
rfalcon100 / seld_dcase2022_ric
My system for the DCASE 2022 Task 3 Sound Event Localization and Detection.
☆12 · Nov 12, 2022 · Updated 3 years ago
colaudiolab / AudioSet-R
Official implementation: "AudioSet-R: A Refined AudioSet with Multi-Stage LLM Label Reannotation"
☆19 · Oct 9, 2025 · Updated 9 months ago
marmoi / dcase2021_task1a_baseline
☆14 · Jun 9, 2021 · Updated 5 years ago
WangHelin1997 / GL-AT
PyTorch implementation of the paper: A Global-local Attention Framework for Weakly Labelled Audio Tagging.
☆13 · Feb 6, 2021 · Updated 5 years ago
haoheliu / DCASE_2022_Task_5
System that ranks 2nd in DCASE 2022 Challenge Task 5: Few-shot Bioacoustic Event Detection
☆28 · Jul 6, 2022 · Updated 4 years ago
haoheliu / diffres-python
Learning differentiable temporal resolution on time-series data.
☆36 · Nov 12, 2022 · Updated 3 years ago
thelahunginjeet / pyica
Python code for Independent Component Analysis
☆14 · Jan 8, 2018 · Updated 8 years ago
lmaxwell / McHuo
A Chinese singing voice dataset: professional male singer, 105 songs, 132 minutes
☆12 · Oct 19, 2023 · Updated 2 years ago
Sreyan88 / RECAP
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
☆16 · Jun 23, 2024 · Updated 2 years ago
chaufanglin / Normal2Whisper
Implementation of "Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation"
☆14 · Oct 31, 2024 · Updated last year
leftthomas / SimSiam
A PyTorch implementation of SimSiam based on CVPR 2021 paper "Exploring Simple Siamese Representation Learning"
☆12 · Mar 23, 2021 · Updated 5 years ago
denfed / leaf-audio-pytorch
PyTorch port of Google Research's LEAF Audio paper
☆91 · May 19, 2021 · Updated 5 years ago
reppy4620 / convnext_tts
Unofficial implementation of ConvNeXt-TTS powered by lightning
☆18 · Oct 20, 2024 · Updated last year
SarthakYadav / leaf-pytorch
PyTorch implementation of the LEAF audio frontend
☆79 · Mar 29, 2023 · Updated 3 years ago
topel / audioset-convnext-inf
Adapting a ConvNeXt model to audio classification on AudioSet
☆27 · Feb 19, 2025 · Updated last year
sony / bigvsan_eval
Evaluation tool used in the BigVSAN paper
☆14 · Mar 22, 2024 · Updated 2 years ago
bellchenx / AudioFolder-Dataloader-PyTorch
This sample includes a simple CNN classifier for music and an audio-folder dataloader that works like ImageFolder in torchvision.
☆11 · Oct 30, 2018 · Updated 7 years ago
facebookresearch / learning-audio-visual-dereverberation
Code for the paper "Learning Audio-Visual Dereverberation"
☆32 · Aug 10, 2022 · Updated 3 years ago
DeepSpectrum / DeepSpectrumLite
Light-weight transfer learning framework for on-device speech and audio recognition using pre-trained image convolutional neural networks…
☆18 · Apr 16, 2022 · Updated 4 years ago
aminul-huq / Speech-Command-Classification
Speech command classification on Speech-Command v0.02 dataset using PyTorch and torchaudio. In this example, three models have been train…
☆10 · Dec 5, 2022 · Updated 3 years ago
kuielab / voice_datasets
🔊 A comprehensive list of open-source datasets for voice and sound computing (50+ datasets).
☆20 · Apr 1, 2021 · Updated 5 years ago
Hadryan / TFNet-for-Environmental-Sound-Classification
Learning discriminative and robust time-frequency representations for environmental sound classification: Convolutional neural networks (…
☆31 · Dec 19, 2019 · Updated 6 years ago
mt-upc / ZeroSwot
Pushing the Limits of Zero-shot End-to-End Speech Translation
☆25 · Dec 12, 2024 · Updated last year
RonFrancesca / dcase2020-fp
☆10 · Jun 26, 2020 · Updated 6 years ago
wushaowu2014 / 2019-iflytek-competition-Alzheimer-s-disease-prediction
Baseline for the 2019 iFLYTEK Alzheimer's Disease Prediction Challenge
☆12 · Jul 12, 2019 · Updated 7 years ago
JeongHun0716 / e-mvsr
Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024)
☆20 · Mar 17, 2025 · Updated last year
philgzl / brever
Speech enhancement in noisy and reverberant environments using deep neural networks
☆23 · Oct 10, 2025 · Updated 9 months ago
stoneMo / OneAVM
Official Codebase of "A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition" (ICML 2023)
☆12 · Jun 1, 2023 · Updated 3 years ago
vincenzodentamaro / aucoresnet
AUCO ResNet: an end-to-end network for Covid-19 pre-screening from cough and breath
☆13 · Mar 18, 2022 · Updated 4 years ago
jackgle / YAMNet-transfer-learning
Transfer learning and fine-tuning with YAMNet
☆21 · Jan 20, 2026 · Updated 6 months ago
LingyiChen-AI / multi-chart-draw-skills
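A drawing tool that supports multiple chart types, including mind maps, flowcharts, data visualization charts, and mathematical function plots; it generates charts in formats such as Mermaid, ECharts, Mindmap, DrawIO, and GeoGebra according to user needs, and exports them as PNG, SVG, HTML, and other formats
☆17 · Jan 26, 2026 · Updated 6 months ago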
sevagh / Real-Time-HPSS
MATLAB + Python implementations of real-time median-filtering Harmonic-Percussive Source Separation
☆22 · Sep 9, 2021 · Updated 4 years ago
qiuqiangkong / dcase2019_task1
☆20 · May 13, 2019 · Updated 7 years ago
WangHelin1997 / Automatic_Speech_Annotator
Automatic speech annotator processing speech with voice activity detection, overlapping speech detection, speaker diarization and automat…
☆33 · Jun 14, 2024 · Updated 2 years ago
ydqmkkx / Respiro-en
Official implementation of paper: Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-…
☆44 · Sep 18, 2024 · Updated last year
JunhoKim94 / ASR_project
This repository was created for the NHN ASR hackathon competition.
☆11 · Sep 20, 2023 · Updated 2 years ago
WangHelin1997 / nnAudio2
Audio processing using a PyTorch 1D convolution network (based on nnAudio). Gammatone Spectrogram and SpecAugmentation are now available…
☆21 · Nov 30, 2020 · Updated 5 years ago
NiliRahmani / Alzheimer-s-Dementia-Recognition-through-Spontaneous-Speech
Alzheimer's Dementia Recognition through Spontaneous Speech: The ADReSSo Challenge
☆15 · Aug 6, 2023 · Updated 2 years ago