llm-jp / llama-mimi
Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequences of interleaved semantic and acoustic tokens.
☆25Updated last month
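The description above mentions a single decoder modeling interleaved semantic and acoustic tokens. The following is a minimal sketch of what such interleaving could look like, not the repository's actual code. It assumes the tokenizer emits several codebook levels per audio frame, with level 0 treated as the semantic level and the rest as acoustic. The function names `interleave_frames` and `deinterleave`, and the per-level id offset scheme, are hypothetical and chosen only for illustration.

```python
from typing import List


def interleave_frames(codes: List[List[int]], codebook_size: int) -> List[int]:
    """Flatten per-frame codebook tokens into one sequence for a decoder.

    codes[q][t] is the token of codebook level q at frame t.
    Assumption: level 0 is semantic, higher levels are acoustic.
    Hypothetical id scheme: level q is shifted by q * codebook_size
    so that tokens from different levels never share an id.
    """
    if not codes:
        return []
    num_frames = len(codes[0])
    if any(len(level) != num_frames for level in codes):
        raise ValueError("all codebook levels must have the same number of frames")
    flat: List[int] = []
    for t in range(num_frames):              # frame-major order
        for q, level in enumerate(codes):    # semantic first, then acoustic
            flat.append(level[t] + q * codebook_size)
    return flat


def deinterleave(flat: List[int], num_levels: int, codebook_size: int) -> List[List[int]]:
    """Invert interleave_frames: recover codes[q][t] from the flat sequence."""
    if len(flat) % num_levels != 0:
        raise ValueError("sequence length must be a multiple of num_levels")
    codes: List[List[int]] = [[] for _ in range(num_levels)]
    for i, token in enumerate(flat):
        q = i % num_levels
        codes[q].append(token - q * codebook_size)
    return codes


if __name__ == "__main__":
    frames = [[1, 2], [3, 4]]  # 2 levels, 2 frames (toy values)
    flat = interleave_frames(frames, codebook_size=10)
    print(flat)
    print(deinterleave(flat, num_levels=2, codebook_size=10))
```

With this layout, one audio frame occupies `num_levels` consecutive positions in the sequence, so a single autoregressive decoder would see both token types in one stream.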
Alternatives and similar repositories for llama-mimi
Users who are interested in llama-mimi are comparing it to the repositories listed below.
- VITS2 using Phoneme-Level Japanese BERT☆14Updated last year
- ☆15Updated 3 months ago
- A TTS Trained on Universal Audio.☆39Updated 4 months ago
- Text-to-Speech Latency Benchmark☆18Updated 4 months ago
- Unofficial implementation of ConvNeXt-TTS powered by lightning☆17Updated last year
- This is not remotely close to a finished product, and does not intend to nor does this claim to be working fine-tuning code for MaskGCT. …☆12Updated 10 months ago
- Fine-tuning Moshi/J-Moshi on your own spoken dialogue data☆73Updated 3 months ago
- My vocoder experiments☆31Updated 3 months ago
- PyTorch implementation of Miipher-2 [2025] which is a speech restoration model by Google DeepMind☆52Updated last month
- Forced alignment decoder for Whisper.☆14Updated last year
- An unofficial VITS2 TTS implementation in PyTorch, adapted for 44100 Hz Japanese audio.☆22Updated 2 years ago
- Nue-ASR inference code by rinna Co., Ltd.☆35Updated last month
- ☆48Updated last year
- Speaker embedding for anime speech domain based on ECAPA_TDNN☆15Updated 4 months ago
- a Neural Vocoder supporting Ring Attention, Conformer and NSF.☆21Updated 3 months ago
- Multi-lingual AudioCaps☆11Updated last year
- ☆14Updated last year
- unofficial pytorch implementation of HiFi-GAN with fast MISR.☆15Updated 2 years ago
- MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline. (Accepted by IALP'2022)☆22Updated 2 years ago
- ☆13Updated last year
- ☆14Updated 2 years ago
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition.☆15Updated 5 months ago
- JATTS: A modern, research-oriented Japanese Text-to-speech Open-sourced Toolkit☆42Updated 5 months ago
- text to speech☆10Updated last year
- ☆13Updated this week
- Training and inference code for a reproduction of Google's speech restoration model Miipher-2. Pretrained models are also released.☆24Updated 3 months ago
- Conformer block with Rotary Position Embedding, modified from lucidrains' implementation☆16Updated last year
- A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts☆15Updated 10 months ago
- My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Updated last year
- ☆16Updated last year