wntg / LLaMA-Omni
Reproduction of the LLaMA-Omni training code
☆59Updated 2 months ago
Alternatives and similar repositories for LLaMA-Omni:
Users who are interested in LLaMA-Omni are comparing it to the libraries listed below.
- An easy-to-use, fast, and easily integrable tool for evaluating audio LLMs☆84Updated this week
- LUCY: Linguistic Understanding and Control Yielding Early Stage of Her☆36Updated last week
- Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction☆180Updated last month
- Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.☆164Updated 5 months ago
- The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.☆120Updated last week
- AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension☆89Updated 4 months ago
- ☆60Updated 3 weeks ago
- A Survey of Spoken Dialogue Models (60 pages)☆287Updated 4 months ago
- TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loud…☆93Updated 4 months ago
- Audio-FLAN☆142Updated last month
- VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling☆73Updated 5 months ago
- 🤗 R1-AQA Model: mispeech/r1-aqa☆236Updated 3 weeks ago
- The open source code for LLM-Codec☆132Updated 8 months ago
- An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement☆152Updated last month
- Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"(ICLR 2024)☆142Updated last year
- Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice☆285Updated 3 months ago
- [NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words☆49Updated 9 months ago
- A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup.☆68Updated 5 months ago
- Official release of StyleTalk dataset.☆62Updated 9 months ago
- This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey".☆136Updated this week
- VoiceBench: Benchmarking LLM-Based Voice Assistants☆177Updated last week
- ✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM☆305Updated 3 months ago
- Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization☆174Updated 9 months ago
- ☆22Updated 7 months ago
- MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction mode…☆201Updated 3 months ago
- ☆13Updated last year
- Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis☆253Updated last month
- BLSP-Emo: Towards Empathetic Large Speech-Language Models☆43Updated 10 months ago
- Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"☆59Updated 2 months ago
- BLSP: Bootstrapping Language-Speech Pre-training via Behavior Alignment of Continuation Writing☆51Updated last year