kyutai-labs / moshi

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
7,506 · Updated last week

Alternatives and similar repositories for moshi:

Users who are interested in moshi are comparing it to the libraries listed below.