This repository documents Barry's journey in learning deep learning for speech processing. Here, you'll find scripts and code snippets related to environment setup, data preprocessing, speech frontend, speech recognition, voice conversion, speech synthesis, and more. Let's explore the fascinating world of speech processing together! 🚀🚀🚀
☆13Oct 8, 2025Updated 6 months ago
Alternatives and similar repositories for barry_speech_tools
Users that are interested in barry_speech_tools are comparing it to the libraries listed below.
- A Chinese Expressive Long-dialogue Speech Dataset with Scripts☆21Nov 11, 2024Updated last year
- ☆27Sep 14, 2024Updated last year
- University of Chinese Academy of Sciences 2023-2024 courses (updating)☆12Jan 12, 2026Updated 3 months ago
- VOICOR: A Residual Iterative Voice Correction Framework for Monaural Speech Enhancement☆46Sep 12, 2024Updated last year
- The baselines of ARC-Challenge-Interspeech2026☆58Dec 1, 2025Updated 5 months ago
- ☆12Apr 26, 2025Updated last year
- ☆19Aug 23, 2024Updated last year
- Hierarchical Vision Transformers for Disease Progression Detection in Chest X-Ray Images☆11Jan 11, 2024Updated 2 years ago
- ☆16Jun 15, 2022Updated 3 years ago
- This is the implementation of the manuscript "Learning General All-Neural Speech Enhancement based on Taylor's Approximation Theory", whi…☆14Nov 25, 2022Updated 3 years ago
- The implementation of TaylorBeamformer, which is in submission to Interspeech2022☆48Jun 10, 2022Updated 3 years ago
- Fairness-Aware Representation Learning by Suppressing Attribute-Class Associations☆13Mar 19, 2026Updated last month
- An interpreter in C for the language brainfuck.☆11Apr 12, 2023Updated 3 years ago
- This repository contains code for an acoustic simulation framework that can be used for acoustic/ultrasonic indoor positioning and/or dat…☆13May 7, 2024Updated last year
- Greifswald Sleep Stage Classifier - a deep-learning based EEG sleep stage classifier☆16Aug 22, 2025Updated 8 months ago
- TDBRAIN EEG Database pre-processing code☆17May 8, 2024Updated last year
- ☆16Dec 22, 2023Updated 2 years ago
- Ultra-fast audio super resolution custom node for ComfyUI, powered by the NovaSR model.☆30Feb 12, 2026Updated 2 months ago
- ☆15Sep 16, 2024Updated last year
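- CS336 Assignment 5 implementation; the dpo/rlhf parts of the supplementary assignment are also completed, and the ablation analysis is in a Feishu document. For reference only.☆33Sep 27, 2025Updated 7 months ago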
- This is the official implementation of Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement☆92May 26, 2025Updated 11 months ago
- ACM MM 2022 paper_AVQA: A Dataset for Audio-Visual Question Answering on Videos☆15Aug 17, 2023Updated 2 years ago
- ☆24Feb 28, 2023Updated 3 years ago
- SLT 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge☆12Jun 11, 2024Updated last year
- ☆16Nov 6, 2023Updated 2 years ago
- MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations☆38Oct 15, 2025Updated 6 months ago
- A collection of tools to improve TJUer's life experience☆19Feb 29, 2024Updated 2 years ago
- Speech enhancement, speech separation, sound source localization☆15Apr 22, 2020Updated 6 years ago
- Implementation of "Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation"☆13Oct 31, 2024Updated last year
- ☆25Sep 30, 2019Updated 6 years ago
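- A library wrapping the speech recognition services of Baidu, SinoVoice (捷通华声), and iFlytek, along with wrappers for the SinoVoice, ethnic-language (民族语文翻译), and NiuTrans (小牛翻译) translation services.☆15Sep 10, 2019Updated 6 years ago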
- ☆26May 5, 2025Updated 11 months ago
- This is a repository for fine-tuning Qwen2-Audio, currently supporting Distributed Data Parallel (DDP) and DeepSpeed.☆52Jul 28, 2025Updated 9 months ago
- ☆11Feb 14, 2025Updated last year
- Codes and datasets for our ICASSP2023 paper, Evaluating parameter-efficient transfer learning approaches on SURE benchmark for speech und…☆43Mar 12, 2023Updated 3 years ago
- arXiv translation fixer!☆22Nov 13, 2024Updated last year
- Training Transformers with knowledge localization (SGTM)☆51Jan 11, 2026Updated 3 months ago
- Some useful tools☆20Nov 28, 2019Updated 6 years ago
- Official repository for LMFCA-Net: A Lightweight Model for Multi-Channel Speech Enhancement with Efficient Narrow-Band and Cross-Band Att…☆29Feb 26, 2025Updated last year