skit-ai / SpeechLLM
This repository contains the training, inference, and evaluation code for SpeechLLM models, along with details about the model releases on Hugging Face.
☆102 · Updated 10 months ago
Alternatives and similar repositories for SpeechLLM:
Users who are interested in SpeechLLM are comparing it to the repositories listed below:
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization ☆175 · Updated 9 months ago
- VoiceBench: Benchmarking LLM-Based Voice Assistants ☆184 · Updated last week
- An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement ☆153 · Updated last month
- a curated list of speech datasets (110+ datasets, 75+ easy to download) ☆130 · Updated 2 years ago
- Audio Codec Speech processing Universal PERformance Benchmark ☆252 · Updated 3 weeks ago
- This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey". ☆138 · Updated 2 weeks ago
- Reference-aware automatic speech evaluation toolkit ☆153 · Updated 5 months ago
- Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context ☆194 · Updated 7 months ago
- UTokyo-SaruLab MOS Prediction System ☆175 · Updated last month
- Paper, Code and Statistics for Self-Supervised Learning and Pre-Training on Speech. ☆205 · Updated last year
- Target Speaker Extraction Toolkit ☆164 · Updated 3 weeks ago
- AudioBench: A Universal Benchmark for Audio Large Language Models ☆201 · Updated last month
- FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3 ☆196 · Updated last year
- Official implementation for Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning ☆88 · Updated 5 months ago
- Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec" ☆158 · Updated 7 months ago
- wav2vec2 audio classification for prosodic boundary detection and other tasks ☆42 · Updated last year
- The implementation for "Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions" ☆39 · Updated 3 weeks ago
- Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch. ☆117 · Updated last year
- Update ASR paper everyday ☆201 · Updated this week
- ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations ☆156 · Updated last year
- The official Pytorch implementation of "Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based … ☆131 · Updated 2 months ago
- Training code for FAcodec presented in NaturalSpeech3 ☆204 · Updated 8 months ago
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024) ☆142 · Updated last year
- This Repository surveys the paper focusing on Prompting and Adapters for Speech Processing. ☆108 · Updated last year
- This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent spee… ☆75 · Updated 10 months ago
- UT-Sarulab MOS prediction system using SSL models ☆232 · Updated last year
- ConMamba for Automatic Speech Recognition ☆72 · Updated 8 months ago
- A Survey of Spoken Dialogue Models (60 pages) ☆293 · Updated 5 months ago
- The official repository of Dynamic-SUPERB. ☆180 · Updated last month
- Audio-FLAN ☆142 · Updated 2 months ago