DeepSpectrum / DeepSpectrumLite
Light-weight transfer learning framework for on-device speech and audio recognition using pre-trained image convolutional neural networks.
☆17 · Updated 2 years ago
Alternatives and similar repositories for DeepSpectrumLite
Users interested in DeepSpectrumLite compare it to the repositories listed below:
- Streaming Audio Transformers for online audio tagging ☆43 · Updated 7 months ago
- (Hybrid) BYOL-S feature extractor using the serab-byols package in PyTorch. ☆27 · Updated 9 months ago
- Implementation for "SoundCLR: Contrastive Learning of Representations For Improved Environmental Sound Classification," in PyTorch. ☆24 · Updated last year
- Official implementation of EfficientLEAF, a learnable audio frontend. ☆39 · Updated 2 years ago
- Improving Recording Device Generalization using Impulse Response Augmentation ☆11 · Updated last year
- Source code for ICASSP2022 "Pseudo Strong labels for large scale weakly supervised audio tagging" ☆30 · Updated 2 years ago
- Spectra extraction tutorials based on torch and torchaudio. ☆41 · Updated last year
- Learning differentiable temporal resolution on time-series data. ☆35 · Updated 2 years ago
- This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fi… ☆36 · Updated 6 months ago
- A speech signal processing library in Python with emphasis on deep learning. ☆31 · Updated 2 years ago
- SERAB: a multi-lingual benchmark for speech emotion recognition ☆28 · Updated 2 years ago
- Code for the paper "Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks". ☆13 · Updated last year
- DUSTED: Spoken-Term Discovery using Discrete Speech Units ☆15 · Updated 3 months ago
- A unified framework for Low-resource Audio Processing and Evaluation (SSL Pre-training and Downstream Fine-tuning) ☆27 · Updated 6 months ago
- The PyTorch implementation of the paper "Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training" ☆42 · Updated last month
- VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer ☆34 · Updated last year
- Repository for reproducing the results in the journal paper "Self-supervised learning for Speech Emotion Recognition" ☆10 · Updated last year
- Towards Intelligibility-Oriented Audio-Visual Speech Enhancement ☆14 · Updated 4 months ago
- SpeechNAS: Better Trade-off between Latency and Accuracy for Large-Scale Speaker Verification ☆30 · Updated last year
- Code and data repository for paper "VoxCeleb enrichment for Age and Gender recognition" submitted at ASRU 2021 ☆67 · Updated 3 years ago
- Code for the paper: "Leveraging speaker attribute information using multi task learning for speaker verification and diarization" present… ☆25 · Updated 2 years ago
- Official repo for "A MODULATION-DOMAIN LOSS FOR NEURAL-NETWORK-BASED REAL-TIME SPEECH ENHANCEMENT" to appear in ICASSP 2021 ☆38 · Updated 3 years ago
- Adapting a ConvNeXt model to audio classification on AudioSet ☆21 · Updated last year
- A PyTorch implementation of the paper "Acoustic Scene Classification with Multiple Decision Schemes". ☆20 · Updated 4 years ago
- A PyTorch implementation of the paper "SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification" ☆31 · Updated 3 years ago
- Python toolkit for likelihood-ratio calibration of binary classifiers ☆25 · Updated last year
- Official PyTorch implementation of PULSE: Positive–Unlabelled Learning for audio Signal Enhancement (Best Paper Award at ICASSP 2023) ☆41 · Updated last year