AI4Bharat / vistaar
Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR
☆46Updated 6 months ago
Alternatives and similar repositories for vistaar:
Users who are interested in vistaar are comparing it to the repositories listed below
- Pretraining, fine-tuning and evaluation scripts for Indic-Wav2Vec2☆82Updated 10 months ago
- NPTEL2020: Speech2Text dataset for Indian-English Accent☆71Updated 3 years ago
- Swarah: Indian-English speech dataset collected across the country☆26Updated last year
- Repository containing experimentation platform on how to train, infer on wav2vec2 models.☆86Updated 2 years ago
- Towards Building Text-To-Speech Systems for the Next Billion Users - Microsoft Research Intern Work - Accepted at ICASSP 2023☆50Updated last year
- This project is about performing Speaker diarization for Hindi Language.☆47Updated 3 years ago
- This repository contains the training, inference, evaluation code for SpeechLLM models and details about the model releases on huggingfac…☆76Updated 6 months ago
- An implementation of the paper titled "Arabic Speech Emotion Recognition Employing Wav2vec2.0 and HuBERT Based on BAVED Dataset" https://…☆12Updated 2 years ago
- Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.☆77Updated last year
- A python package for whisper normalizer☆46Updated last month
- Indic TTS for Indian Languages: This is a project on developing text-to-speech (TTS) synthesis systems for Indian languages, improving qu…☆14Updated 11 months ago
- An end-to-end system which makes use of a recurrent encoder-decoder deep neural network to translate speech from the Hindi (Fourth most s…☆18Updated 5 years ago
- Finetune Wav2vec 2.0 For Speech Recognition☆121Updated last year
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation☆139Updated last year
- Code for the paper: GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities☆102Updated last month
- This will hold the data pipeline to convert raw audio data to speech which will act as input dataset for speech-to-text pipeline☆32Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer☆46Updated 6 months ago
- INTERSPEECH 23 - Refunction Whisper to recognize new tasks with adapters!☆33Updated last year
- Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning (ASRU2023)☆27Updated last year
- Indic-Conformer models for ASR☆15Updated 5 months ago
- asr2k☆48Updated 7 months ago
- Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models☆147Updated last year
- An implementation of Speech Emotion Recognition, based on HuBERT model, training with PyTorch and HuggingFace framework, and fine-tuning …☆32Updated 2 years ago
- A mini, simple, and fast end-to-end automatic speech recognition toolkit.☆50Updated 2 years ago
- GSoC'2021 | TensorFlow implementation of Wav2Vec2☆91Updated 3 years ago
- Code and data repository for paper "VoxCeleb enrichment for Age and Gender recognition" submitted at ASRU 2021☆66Updated 3 years ago
- Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO☆60Updated 2 years ago
- Clustering-based methods for overlapping diarization☆74Updated last year