In this work I investigate the speech command task by developing and analyzing deep learning models. State-of-the-art approaches use convolutional neural networks (CNNs) because they are inherently suited to learning correlated representations, such as those found in speech. In particular, I develop different CNNs trained on the Google Speech Command Dataset…
☆19Jul 18, 2018Updated 7 years ago
Alternatives and similar repositories for learning_invariances_in_speech_recognition
Users who are interested in learning_invariances_in_speech_recognition are comparing it to the repositories listed below
- Text-to-Speech Synthesis by Generating Spectrograms using Generative Adversarial Network☆10Dec 12, 2018Updated 7 years ago
- ☆33Nov 27, 2021Updated 4 years ago
- ☆37Sep 21, 2025Updated 5 months ago
- ☆10Jul 29, 2025Updated 6 months ago
- Manage audio and video datasets☆33Updated this week
- ☆30Jan 13, 2022Updated 4 years ago
- GaugeMeterView is a view that can be used in different meter applications☆12Feb 25, 2022Updated 4 years ago
- Creates a CMM script that can be directly executed on Kaggle from an easy merge script☆14Jan 12, 2026Updated last month
- fast SpecAugmentation code with numpy and scipy☆31Jul 5, 2019Updated 6 years ago
- Machine Learning based model to predict Insurance Pure Premium☆12Jan 24, 2017Updated 9 years ago
- Koel Labs innovates open-source speech research, inclusive speech technologies, and real-time pronunciation feedback for language learner…☆18Updated this week
- Spell correction language model for Uyghur language based on transformer neural network☆14Jun 18, 2025Updated 8 months ago
- UzTransliterator | State-of-the-art machine transliteration tool for Uzbek language☆13Jan 6, 2026Updated last month
- Conversion of Electrocardiography paper records to binarization and converting to digital form in order to extract features to feed in th…☆10Dec 16, 2020Updated 5 years ago
- Using large language models to maintain AI_CHANGELOG.md☆14Jul 15, 2024Updated last year
- Train neural network via pytorch, and run nn model on ESP32☆11Dec 1, 2022Updated 3 years ago
- ☆36Feb 13, 2026Updated last week
- This repository defines a python class that can be used to load data for the tf.keras.model.fit_generator function by using a torch.utils…☆11Oct 26, 2024Updated last year
- The codebase for Data-driven general-purpose voice activity detection.☆93Aug 3, 2023Updated 2 years ago
- Speech Recognition for Uyghur using deep learning☆42Oct 21, 2021Updated 4 years ago
- NLP scripts etc for Radiology projects☆11Jul 22, 2016Updated 9 years ago
- A RESTful API server to control ChatdollKit-based AITuber 💬☆13Jan 14, 2025Updated last year
- A C++ implementation of stft, melspectrogram and mel_to_stft☆10Jun 2, 2022Updated 3 years ago
- msglm makes it a little easier to create messages for language models like Claude and OpenAI GPTs.☆14Jan 29, 2026Updated 3 weeks ago
- there are UKIJ and Uighursoft fonts☆13Oct 21, 2022Updated 3 years ago
- Framework for Deep Speech Recognition☆11Nov 22, 2022Updated 3 years ago
- uyghur text resource crawled from website☆12Dec 25, 2015Updated 10 years ago
- Make N-Gram for Uyghur language☆15Dec 24, 2020Updated 5 years ago
- Acoustic event detection using recurrent neural networks.☆11Sep 4, 2018Updated 7 years ago
- ☆12Aug 30, 2017Updated 8 years ago
- WhatsApp statistics toolkit mirror☆10Mar 24, 2019Updated 6 years ago
- ☆10Apr 8, 2024Updated last year
- An implementation of "Subspace Representations for Soft Set Operations and Sentence Similarities" (NAACL 2024)☆10May 31, 2024Updated last year
- Azuki is a free text editor engine written in C# 2.0. This repository is an extended version created by forking the original on GitHub.☆11Feb 26, 2023Updated 2 years ago
- A simple document and image search engine implemented in keras☆11Feb 22, 2018Updated 8 years ago
- MusicYOLO framework uses the object detection model, YOLOx, to locate notes in the spectrogram.☆11Jan 29, 2022Updated 4 years ago
- An example AWS SAM app showing how to deploy a fastai app using Lambda Container feature☆13Dec 6, 2020Updated 5 years ago
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆13Dec 4, 2024Updated last year
- code for paper "learning to fool the speaker recognition"☆10Jun 12, 2020Updated 5 years ago