koudounasalkis / Audio-Speech-Tutorial
This repository contains a short introduction to audio and speech processing, from basics to applications.
☆20 · Updated last year
Alternatives and similar repositories for Audio-Speech-Tutorial:
Users who are interested in Audio-Speech-Tutorial are comparing it to the repositories listed below.
- This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp… ☆12 · Updated last year
- 🎹 pyannote + 📓 notebook = pyannotebook ☆26 · Updated last year
- Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024 ☆18 · Updated last month
- Code for the winning solution in the SE&R 2022 Challenge - SER track. ☆13 · Updated last year
- Speech-MASSIVE is a multilingual Spoken Language Understanding (SLU) dataset comprising the speech counterpart for a portion of the MASSI… ☆20 · Updated 5 months ago
- This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fi… ☆36 · Updated 6 months ago
- Speakerbox: Fine-tune Audio Transformers for speaker identification. ☆53 · Updated last month
- Rescoring methods for end-to-end Automatic Speech Recognition ☆27 · Updated 4 years ago
- Machine learning speaker characteristics ☆33 · Updated this week
- African accented clinical and general domain TTS ☆10 · Updated 7 months ago
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories. ☆19 · Updated 3 months ago
- This app is intended to automatically create a corpus for ASR systems using pseudo-labeling. ☆27 · Updated 11 months ago
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 · Updated last year
- ☆13 · Updated last year
- Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation ☆15 · Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆46 · Updated 7 months ago
- Explore different ways to combine speech models (wav2vec2, HuBERT) with NLP models (BART, T5, GPT) ☆44 · Updated last year
- Goodness of Pronunciation algorithm using PyKaldi ☆15 · Updated 2 years ago
- INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters! ☆33 · Updated last year
- Repository having the code and models from the paper: data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student traini… ☆11 · Updated 10 months ago
- ☆19 · Updated last year
- A mini, simple, and fast end-to-end automatic speech recognition toolkit. ☆50 · Updated 2 years ago
- Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning ☆11 · Updated 7 months ago
- ☆9 · Updated 3 months ago
- LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi. ☆22 · Updated 5 years ago
- Codes and datasets for our ICASSP2023 paper, Evaluating parameter-efficient transfer learning approaches on SURE benchmark for speech und… ☆43 · Updated last year
- Prosodic Speech Segmentation with Transformers ☆25 · Updated 11 months ago
- Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech ☆10 · Updated last year
- A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features ☆11 · Updated 2 years ago
- PyTorch implementation of "Integrated Parameter-Efficient Tuning for General-Purpose Audio Models" ☆10 · Updated last year