YanZiBuGuiCHunShiWan / RESTFUL_ASR
A short-utterance online speech recognition service based on WeNet
☆11Updated 2 years ago
Alternatives and similar repositories for RESTFUL_ASR:
Users who are interested in RESTFUL_ASR are comparing it to the libraries listed below
- A pytorch template for beginners based on pytorch_lightning☆43Updated last year
- Official implementation of "Dual-stream Time-Delay Neural Network with Dynamic Global Filter for Speaker Verification" in PyTorch☆40Updated last year
- Code for Audio-Visual Target Speaker Extraction with Selective Auditory Attention (TASLP)☆16Updated last month
- Official code release for "TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion", accepted ICIST 2023☆12Updated last year
- ☆35Updated 9 months ago
- Code for INTERSPEECH 2023 paper "mdctGAN: Taming transformer-based GAN for speech super-resolution with Modified DCT spectra"☆60Updated last year
- TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages☆14Updated 11 months ago
- Models and codes for INTERSPEECH 2023 paper DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model☆11Updated 3 weeks ago
- ☆25Updated last year
- [ACM MM24] Official implementation of paper "From Speaker to Dubber: Movie Dubbing with Prosody and Duration Consistency Learning"☆26Updated 3 months ago
- Training code for MaskGCT-T2S model.☆19Updated 4 months ago
- paper for Anomalous sound detection☆18Updated last month
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling☆74Updated 5 months ago
- Reproduction of the LLaMA-Omni training code☆60Updated 3 months ago
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"☆62Updated this week
- A collection of papers in the speech and audio field. Currently covers: ICLR2023-2025, ICML2023-2024, NeurIPS2023-2024, ACMMM2024, AAAI2024, ACL2024, EMNLP…☆49Updated this week
- A Lightweight One-Shot Whisper to Normal Voice Conversion Model Using Distillation of Self-Supervised Features☆11Updated last week
- Official repo for CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations☆47Updated 3 months ago
- ☆9Updated 4 months ago
- AD-TUNING: An Adaptive CHILD-TUNING Approach to Efficient Hyperparameter Optimization of Child Networks for Speech Processing Tasks in th…☆11Updated last year
- The official project website of "KernelWarehouse: Rethinking the Design of Dynamic Convolution" (KW for short, published in ICML 2024)☆100Updated 10 months ago
- ☆70Updated last year
- Official repository for the paper "xLSTM-SENet: xLSTM for Single-Channel Speech Enhancement"☆29Updated last month
- Official repository for Mamba-based Segmentation Model for Speaker Diarization☆35Updated 6 months ago
- ☆26Updated last week
- baseline for IEEE ICME 2024 GC: Semi-supervised Acoustic Scene Classification under Domain Shift☆17Updated last year
- official implementation of MGA-CLAP (ACM MM 2024)☆14Updated 6 months ago
- [INTERSPEECH 2023] Knowledge Transfer from Pre-trained Language Models to Cif-based Recognizers via Hierarchical Distillation☆39Updated last year
- SSL Layerwise analysis for speech deepfake detection☆22Updated 2 months ago
- Offline Speaker Diarization with SenseVoice by Sherpa ONNX.☆12Updated 4 months ago