tuanio / nextformer
PyTorch implementation of "Nextformer: A ConvNeXt Augmented Conformer For End-To-End Speech Recognition"
☆11 Updated last year
Related projects:
- Efficient Personalized Speech Enhancement through Self-Supervised Learning ☆21 Updated last year
- This is a public repository for RATS Channel-A Speech Data, which is a paid noisy speech dataset distributed by LDC. Here we release its Log… ☆11 Updated last year
- ☆13 Updated 10 months ago
- A small tool to calculate the distribution of audio durations in a directory ☆13 Updated last year
- An STFT/iSTFT implementation written in PyTorch using 1D convolutions ☆24 Updated 2 months ago
- ☆25 Updated 3 months ago
- We design a spectral compression mapping (SCM) for full-band speech enhancement, and propose a two-stage stream named MHA-DPCRN ☆20 Updated 2 years ago
- Dynamic Mixing For Speech Processing (mix-on-the-fly) ☆13 Updated 2 years ago
- ☆15 Updated last year
- ☆24 Updated last year
- Dataset simulation for DPCCN. ☆14 Updated last year
- Unofficial implementation of "CPTNN: Cross-Parallel Transformer Neural Network for Time-Domain Speech Enhancement" ☆14 Updated 10 months ago
- Source code for the paper "Unified Keyword Spotting and Audio Tagging on Mobile Devices with Transformers" ☆23 Updated last year
- ☆14 Updated 10 months ago
- Model configurations for scaling SE models in the paper "Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enha… ☆23 Updated last month
- A PyTorch implementation of the paper "SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification" ☆31 Updated 3 years ago
- ☆9 Updated last year
- This is the implementation of the manuscript "Learning General All-Neural Speech Enhancement based on Taylor's Approximation Theory", whi… ☆14 Updated last year
- Source code for "BLOOM-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement" ☆12 Updated 2 years ago
- ☆13 Updated 4 months ago
- TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings ☆15 Updated last month
- ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation ☆20 Updated 6 months ago
- PyTorch implementation of "A Deep Learning Loss Function based on Auditory Power Compression for Speech Enhancement" ☆28 Updated 2 years ago
- System that ranked 2nd in DCASE 2022 Challenge Task 5: Few-shot Bioacoustic Event Detection ☆27 Updated 2 years ago
- A toolkit for researchers in multimodal sound separation. ☆16 Updated 11 months ago
- ☆35 Updated 4 months ago
- ☆13 Updated 2 months ago
- ☆25 Updated last year
- ☆13 Updated 11 months ago
- Official code for MUSE: Flexible Voiceprint Receptive Fields and Multi-Path Fusion Enhanced Taylor Transformer for U-Net-based Speech Enh… ☆25 Updated 2 months ago