ErikEkstedt / TurnGPT
TurnGPT: a Transformer-based Language Model for Predicting Turn-taking in Spoken Dialog
☆47 Updated 9 months ago
Alternatives and similar repositories for TurnGPT:
Users who are interested in TurnGPT are comparing it to the libraries listed below.
- Datasets for turn-taking research ☆12 Updated last year
- ☆34 Updated 3 years ago
- Voice Activity Projection Models: Self-supervised learning of Turn-taking Events ☆53 Updated 9 months ago
- ☆21 Updated 6 years ago
- ☆13 Updated 2 years ago
- ☆35 Updated 6 months ago
- 56 language, 1 model Multilingual ASR ☆25 Updated 3 years ago
- asr2k ☆49 Updated 9 months ago
- A multilingual phoneme recognizer capable of generalizing zero-shot to unseen phoneme inventories. ☆19 Updated 4 months ago
- Word Discovery in Visually Grounded, Self-Supervised Speech Models ☆26 Updated last year
- vad ☆15 Updated last year
- VoiceBench: Benchmarking LLM-Based Voice Assistants ☆138 Updated this week
- EMNLP 23 - Integrating Whisper Encoder to LLaMA Decoder for Generative ASR Error Correction ☆249 Updated 9 months ago
- Explore different way to mix speech model(wav2vec2, hubert) and nlp model(BART,T5,GPT) together ☆44 Updated last year
- Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation ☆143 Updated last year
- ☆37 Updated 3 years ago
- This repository contains the training, inference, evaluation code for SpeechLLM models and details about the model releases on huggingfac… ☆87 Updated 8 months ago
- Repository containing the open source code of works published at the FBK MT unit. ☆42 Updated last month
- Code for AccentDB. ☆20 Updated 3 years ago
- Implementation of the model "AudioFlamingo" from the paper: "Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dial… ☆40 Updated last month
- Collection of scripts from mHuBERT-147. ☆24 Updated 3 months ago
- Official code for Wav2Seq ☆96 Updated 2 years ago
- A spoken question answering dataset on SQUAD ☆46 Updated 2 years ago
- AudioBench: A Universal Benchmark for Audio Large Language Models ☆136 Updated this week
- **Interspeech 2022** 《SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks》Speec… ☆98 Updated last year
- SpeechGLUE is a speech version of the GLUE benchmark, driven by text-to-speech. ☆13 Updated last year
- Speaker change detection using SincNet and an LSTM/Transformer ☆47 Updated 8 months ago
- Official code for Interspeech 2023 paper "Self-supervised Fine-tuning for Improved Content Representations by Speaker-invariant Clusterin… ☆49 Updated last year
- ☆73 Updated this week
- Prosodic Speech Segmentation with Transformers ☆25 Updated last year