Mispronunciation Detection using a pretrained and finetuned wav2vec2 model for phoneme recognition and diagnosis and feedback using large language model
☆52May 6, 2024Updated last year
Alternatives and similar repositories for mispronunciation-detection-diagnosis-wav2vec2-and-llm
Users that are interested in mispronunciation-detection-diagnosis-wav2vec2-and-llm are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- This repository contains all the codes used in a thesis at Information Technology University (ITU). The topic of the thesis is pronunciat…☆26Jun 25, 2019Updated 6 years ago
- [Interspeech22]Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Ass…☆35Jan 23, 2024Updated 2 years ago
- This repo related to the paper "A Framework for Phoneme-Level Pronunciation Assessment Using CTC" for INTERSPEECH2024☆38Feb 5, 2026Updated 2 months ago
- A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features☆10Oct 5, 2022Updated 3 years ago
- ☆28Nov 7, 2023Updated 2 years ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- Goodness of Pronunciation (GOP) for oral reading assessment.☆54Nov 17, 2021Updated 4 years ago
- Code for the ICASSP 2022 paper "Transformer-Based Multi-Aspect Multi-Granularity Non-native English Speaker Pronunciation Assessment".☆202Feb 13, 2023Updated 3 years ago
- ☆19Jun 28, 2022Updated 3 years ago
- Goodness of Pronunciation algorithm using PyKaldi☆18Jun 12, 2022Updated 3 years ago
- The official implementation of the method discussed in the paper Improving Spoken Language Identification with Map-Mix(work accepted at I…☆18Feb 17, 2023Updated 3 years ago
- ☆49Apr 12, 2024Updated 2 years ago
- Self-Supervised Speech Pre-training and Representation Learning Toolkit.☆10Feb 29, 2024Updated 2 years ago
- This repository includes the code to reproduce our paper [Explainable deepfake and spoofing detection: an attack analysis using SHapley A…☆12Jan 24, 2024Updated 2 years ago
- A study of the downstream instability of word embeddings☆12Aug 23, 2022Updated 3 years ago
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- Preschool evaluation is crucial because it gives teachers and parents influential knowledge about children's growth and development. The …☆20May 24, 2023Updated 2 years ago
- ☆25Jun 14, 2022Updated 3 years ago
- Normalized Wasserstein for Mixture Distributions☆11Mar 24, 2023Updated 3 years ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"☆10Jul 19, 2024Updated last year
- Implementation of "Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation"☆13Oct 31, 2024Updated last year
- Matlaba and Python Solutions on machine learnign coursera on Coursera by Andrew Ng☆10Jun 23, 2018Updated 7 years ago
- ☆16Jun 13, 2024Updated last year
- unofficial implementation of "CPTNN: CROSS-PARALLEL TRANSFORMER NEURAL NETWORK FOR TIME-DOMAIN SPEECH ENHANCEMENT"☆15Nov 14, 2023Updated 2 years ago
- Code and data for Teddy https://arxiv.org/abs/2001.05171.☆15Jun 21, 2022Updated 3 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- Simple Kaldi recipe for forced alignment☆11Jul 16, 2023Updated 2 years ago
- Voice Conversion method based on speaker style☆14Aug 7, 2021Updated 4 years ago
- Csenet: Complex Squeeze-and-Excitation Network for Speech Depression Level Prediction (ICASSP 2022)☆14Jun 23, 2022Updated 3 years ago
- 2020年互联网+方言转换☆14Nov 2, 2020Updated 5 years ago
- Goodness of Pronunciation using Kaldi on Epa-DB database☆35Jan 17, 2024Updated 2 years ago
- ☆58Dec 2, 2024Updated last year
- This repository includes training, inference, evaluation, and utility scripts developed for fine-tuning the Whisper medium.en model on Ai…☆26Oct 9, 2024Updated last year
- Leveraging BERT to Improve Spoken Language Identification☆17Nov 22, 2022Updated 3 years ago
- In this repository, I try to combine k2 with speechbrain to decode well and fastly.☆16Jun 17, 2022Updated 3 years ago
- Wordpress hosting with auto-scaling - Free Trial • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- Parses a document (scanned or phone captured) and returns the underlying question - answer layout structured capture by LayoutXLM model☆10Jun 14, 2021Updated 4 years ago
- Code for the paper "Transcribing Human Piano Performances into Music Notation" accepted at ISMIR 2016☆17Jan 24, 2018Updated 8 years ago
- Audio tokenization, in the fastest way possible!☆54Aug 26, 2024Updated last year
- convert PubLayNet data into METS/PAGE-XML☆10Mar 17, 2020Updated 6 years ago
- A neural speech codec based on discrete WavLM representations☆26Aug 28, 2024Updated last year
- libsoni: A Python Toolbox for Sonifying Music Annotations and Feature Representations☆26Mar 24, 2025Updated last year
- Acoustic distance measure for comparing pronunciations☆17Aug 2, 2022Updated 3 years ago