Nitin4525 / SpeechEnhancement
Combining Weighted Multi-resolution STFT Loss and Distance Fusion to Optimize Speech Enhancement Generative Adversarial Networks
☆58 Updated 3 years ago
Related projects
Alternatives and complementary repositories for SpeechEnhancement
- This is the demo of our paper "IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation". ☆111 Updated 5 months ago
- Single-blind supplementary materials for NeurIPS 2023 submission ☆93 Updated last week
- Code for paper "Noise-aware Speech Enhancement using Diffusion Probabilistic Model" ☆84 Updated 5 months ago
- Voice-Face Association Learning Evaluation ☆54 Updated 8 months ago
- ☆191 Updated last week
- When doing audio and video sentiment recognition, I found that a lot of code is duplicated, often a function in different time debugging … ☆39 Updated 3 years ago
- ☆33 Updated 2 years ago
- ICANN 2021: Multi-Modal Chorus Recognition for Improving Song Search ☆28 Updated 3 years ago
- ☆50 Updated 2 years ago
- The PyTorch implementation of our Pattern Recognition 2022 paper, FocusNet, on ILSVRC2012 ☆45 Updated 2 years ago
- Pre-train support for OpenNMT (PNMT) ☆36 Updated 2 years ago
- PyTorch implementation of "MHS-VM: Multi-Head Scanning in Parallel Subspaces for Vision Mamba" ☆46 Updated 3 months ago
- PyTorch implementation of the paper "VLDeformer: Vision Language Decomposed Transformer for Fast Cross-modal Retrieval", KBS 2022 ☆26 Updated 2 years ago
- ☆34 Updated 4 years ago
- Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models ☆209 Updated 2 months ago
- Code for paper "Large Language Models are Efficient Learners of Noise-Robust Speech Recognition" ☆155 Updated 6 months ago
- CATA storage. Design based on Flow Blockchain, with fast, secure, and developer-friendly features. Supports the next generation of games, a… ☆30 Updated last year
- ☆27 Updated 4 years ago
- TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models (2024 ICASSP) ☆137 Updated 2 months ago
- ☆25 Updated last year
- ☆21 Updated last week
- Kaggle M5 competition, ranked 292/5589 (top 5%). ☆10 Updated 2 years ago
- Software Architect Roadmap (updating...) ☆19 Updated last year
- https://blog.csdn.net/Shockang/article/details/120599137 ☆31 Updated 2 years ago
- An AI digital human real-time streaming video voice call project, including picture and voice input and picture voice output ☆43 Updated last week
- Using different CNN models to train on GTZAN Dataset ☆50 Updated 11 months ago