wntg / LLaMA-Omni
Reproduction of the LLaMA-Omni training code
☆55Updated last month
Alternatives and similar repositories for LLaMA-Omni:
Users who are interested in LLaMA-Omni are comparing it to the repositories listed below
- Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.☆161Updated 4 months ago
- A Survey of Spoken Dialogue Models (60 pages)☆271Updated 3 months ago
- AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension☆81Updated 3 months ago
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.☆110Updated 2 months ago
- An easy-to-use, fast, and easily integrable tool for evaluating audio LLM☆62Updated last week
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"(ICLR 2024)☆138Updated last year
- A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.☆65Updated 4 months ago
- An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement☆142Updated 2 weeks ago
- The open source code for LLM-Codec☆132Updated 6 months ago
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…☆93Updated 2 months ago
- [NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words☆48Updated 8 months ago
- Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice☆270Updated 2 months ago
- This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey".☆121Updated 2 months ago
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling☆65Updated 4 months ago
- Awesome Neural Codec Models, Text-to-Speech Synthesizers & Speech Language Models☆118Updated this week
- Updates ASR papers every day☆161Updated this week
- ✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM☆289Updated 2 months ago
- Official release of StyleTalk dataset.☆62Updated 8 months ago
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization☆169Updated 8 months ago
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"☆54Updated last month
- Grapheme-to-Phoneme for Mixed Chinese (Mandarin or Cantonese) and English.☆86Updated 3 weeks ago
- ☆68Updated last year
- VoiceBench: Benchmarking LLM-Based Voice Assistants☆140Updated this week
- ☆45Updated last month
- ☆13Updated last year
- The reproduction code for Qwen-Audio fine-tuning☆35Updated 7 months ago