yukiarimo / hanasu
Hanasu is a human-like TTS model based on the multilingual Himitsu V1 transformer-based encoder and VITS architecture.
☆26 Updated 3 weeks ago
Alternatives and similar repositories for hanasu:
Users who are interested in hanasu compare it to the libraries listed below.
- Run Orpheus 3B Locally with Gradio UI, Standalone App ☆20 Updated last month
- Yet Another (LLM) Web UI, made with Gemini ☆12 Updated 4 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆34 Updated 9 months ago
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆27 Updated 6 months ago
- Implementation of Sesame's Conversational Speech Model for Hugging Face Transformers ☆53 Updated 3 weeks ago
- Deploy Apollo HF space locally ☆40 Updated 4 months ago
- Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification. ☆16 Updated last month
- TTS support with GGML ☆32 Updated this week
- Whisper STT + Orpheus TTS + Gemma 3 using LM Studio to create a virtual assistant. ☆48 Updated this week
- Zero-shot real-time TTS system, fully offline, free and open source ☆34 Updated 2 weeks ago
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G… ☆23 Updated last month
- Real‑time, low‑latency voice, vision, and conversational‑memory AI assistant built on LiveKit and local LLMs ☆23 Updated 2 weeks ago
- This repo provides a simple Gradio UI to run Qwen2 VL 72B AWQ in venv and have both image and video inferencing work. ☆30 Updated 7 months ago
- ☆66 Updated last week
- Dou (道) - AI powered analysis and feedback for notes and mind maps ☆28 Updated 2 weeks ago
- Open source tool for transcription and subtitling, alternative to happyscribe. ☆26 Updated 2 months ago
- Analyze Reddit posts ☆26 Updated 2 months ago
- LLM backed Fantasy Tribe Game ☆18 Updated 5 months ago
- Automatically cleaning, enhancing, segmenting, filtering, and formatting a dataset to fine tune or train a voice model. ☆35 Updated last week
- Dia-JAX: A JAX port of Dia, the text-to-speech model for generating realistic dialogue from text with emotion and tone control. ☆20 Updated this week
- StyleTTS 2 Optimized Training Fork ☆28 Updated 3 months ago
- Orpheus Chat WebUI ☆53 Updated last month
- Use the Moondream 2 model to detect faces and their gaze directions in videos. ☆39 Updated 3 months ago
- Think of it as giving your AI a searchable diary and knowledge base that grows with every conversation. ☆15 Updated this week
- ☆33 Updated 11 months ago
- Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX ☆17 Updated 6 months ago
- Experimental LLM Inference UX to aid in creative writing ☆116 Updated 4 months ago
- Automated LLM novelist ☆45 Updated last year
- A lightweight, efficient variation of the StyleTTS 2 text‐to‐speech model. ☆14 Updated last week
- LangoTango - A local language model powered language learning partner ☆22 Updated last week