Light-weight transfer learning framework for on-device speech and audio recognition using pre-trained image convolutional neural networks.
☆18 · Apr 16, 2022 · Updated 3 years ago
Alternatives and similar repositories for DeepSpectrumLite
Users who are interested in DeepSpectrumLite are comparing it to the libraries listed below.
- ☆138 · Aug 29, 2024 · Updated last year
- ☆29 · Mar 8, 2022 · Updated 4 years ago
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model ☆13 · Mar 30, 2025 · Updated 11 months ago
- Target speaker automatic speech recognition (TS-ASR) ☆12 · Oct 14, 2023 · Updated 2 years ago
- kaldi cnn-tdnnf baseline ☆13 · Aug 31, 2021 · Updated 4 years ago
- The official implementation of DMEL the method presented in the paper "DMEL: The differentiable log-Mel spectrogram as a trainable layer … ☆22 · Dec 21, 2024 · Updated last year
- RespireNet is an innovative web-based application that harnesses the capabilities of deep learning and Mel-frequency cepstral coefficient… ☆10 · Aug 2, 2023 · Updated 2 years ago
- Getting confidences from any end-to-end systems ☆11 · May 24, 2023 · Updated 2 years ago
- Code for the submitted 2021 DCASE Workshop paper: "Waveforms and Spectrograms: Enhancing Acoustic Scene Classification Using Multimodal F… ☆16 · Aug 9, 2021 · Updated 4 years ago
- [ACII 2023] PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Spe… ☆60 · Jul 1, 2024 · Updated last year
- ☆17 · Aug 10, 2021 · Updated 4 years ago
- recent audio generation papers (including speech, music and general audios) ☆13 · Mar 14, 2023 · Updated 3 years ago
- Code for the paper "FastAdaSP: An Efficient Multitask Inference Framework for Large Speech Language Models". @ EMNLP'24(Oral) ☆17 · Nov 14, 2024 · Updated last year
- ☆15 · Jul 4, 2024 · Updated last year
- Tensorflow Implementation for "Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition" ☆10 · Dec 19, 2021 · Updated 4 years ago
- ☆17 · Jul 22, 2024 · Updated last year
- Depression-Detection represents a machine learning algorithm to classify audio using acoustic features in human speech, thus detecting de… ☆14 · Jul 10, 2020 · Updated 5 years ago
- MATLAB + Python implementations of real-time median-filtering Harmonic-Percussive Source Separation ☆21 · Sep 9, 2021 · Updated 4 years ago
- Perform three types of feature extraction: STFT, MFCC and MelSpectrogram. Apply CNN/VGG with or without RNN architecture. Able to achieve… ☆15 · Jun 28, 2020 · Updated 5 years ago
- ☆21 · Sep 2, 2020 · Updated 5 years ago
- MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels. ☆22 · Apr 8, 2021 · Updated 4 years ago
- This is a public repository for RATS Channel-A Speech Data, which is a chargeable noisy speech dataset under LDC. Here we release its Log… ☆16 · Oct 22, 2022 · Updated 3 years ago
- Automatic Speech Recognition (ASR) system for the Samrómur speech corpus using Kaldi ☆12 · Sep 30, 2022 · Updated 3 years ago
- ☆13 · Jan 14, 2025 · Updated last year
- Speech emotion recognition using LSTM, SVM and MLP | 语音情感识别 ☆10 · Jul 1, 2019 · Updated 6 years ago
- ☆19 · Mar 2, 2024 · Updated 2 years ago
- Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task. ☆24 · Feb 25, 2025 · Updated last year
- SERAB: a multi-lingual benchmark for speech emotion recognition ☆28 · Dec 16, 2022 · Updated 3 years ago
- ☆23 · Jun 24, 2024 · Updated last year
- Song Plays Workshop Tutorial ☆13 · Nov 19, 2020 · Updated 5 years ago
- Cough detection with Log Mel Spectrogram, Wavelet Transform, Deep learning and Transfer learning techniques ☆17 · Dec 12, 2020 · Updated 5 years ago
- Non-parallel voice conversion called ICRCycleGAN-VC based on CycleGAN and Inception-resNet module by Afiuny ☆15 · Oct 30, 2025 · Updated 4 months ago
- ☆64 · Jun 28, 2023 · Updated 2 years ago
- ☆11 · Oct 20, 2022 · Updated 3 years ago
- ☆14 · Mar 24, 2023 · Updated 2 years ago
- Implementation of Google's USM speech model in Pytorch ☆35 · Feb 7, 2026 · Updated last month
- Official repository for the paper "Audio xLSTMs: Learning Self-supervised audio representations with xLSTMs" ☆21 · Sep 7, 2025 · Updated 6 months ago
- 80s FM video game music dataset (ISMIR 2022) ☆26 · Jan 10, 2023 · Updated 3 years ago
- Detecting depressed Patient based on Speech Activity, Pauses in Speech and Using Deep learning Approach ☆20 · Jan 5, 2023 · Updated 3 years ago