AIDASLab / MathReader
Implementation of MathReader, Text-to-Speech for Mathematical Documents
☆24 Updated 3 months ago
Alternatives and similar repositories for MathReader
Users who are interested in MathReader are comparing it to the repositories listed below.
- Official repository of the IEEE SLT 2024 paper "Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT" ☆44 Updated 2 months ago
- ☆16 Updated 8 months ago
- ☆38 Updated last year
- Code associated with the paper: CTC-DRO: Robust Optimization for Reducing Language Disparities in Speech Recognition. ☆15 Updated 7 months ago
- A package for NeuCodec: a 50hz, 0.8kbps, 24kHz audio codec. ☆135 Updated 2 months ago
- An official implementation of Style-Talker for Spoken Dialogue Generation ☆23 Updated 11 months ago
- Audio tokenization, in the fastest way possible! ☆53 Updated last year
- ☆50 Updated 2 weeks ago
- ☆41 Updated 10 months ago
- ☆13 Updated last year
- ☆30 Updated 2 months ago
- This is a fork of the original fairseq repository (version 0.12.2) with added classes for training mHuBERT-147. ☆19 Updated last year
- ☆48 Updated 5 months ago
- ☆41 Updated 5 months ago
- ☆38 Updated 8 months ago
- ☆19 Updated last year
- This repository contains the code and data for the paper EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control by Haozhe Chen,… ☆80 Updated last year
- Official implementation of paper: Shallow Flow Matching for Coarse-to-Fine Text-to-Speech Synthesis ☆47 Updated 3 months ago
- [ACL 2024] Generative Pre-Trained Speech Language Model with Efficient Hierarchical Transformer ☆66 Updated last year
- A TTS Trained on Universal Audio. ☆41 Updated 6 months ago
- PyTorch implementation of Miipher-2 [2025] which is a speech restoration model by Google DeepMind ☆61 Updated 3 months ago
- ☆15 Updated last month
- ☆103 Updated 2 months ago
- Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models ☆49 Updated 3 months ago
- Collection of scripts from mHuBERT-147. ☆32 Updated last year
- A fast speech-to-speech & speech-to-text translation model that supports simultaneous decoding and offers 28× speedup. ☆77 Updated last year
- Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequ… ☆28 Updated 3 months ago
- Streamable Text-to-Speech model using a language modeling approach, without vector quantization ☆106 Updated 7 months ago
- small audio language model for reasoning ☆83 Updated 3 weeks ago
- ☆17 Updated 6 months ago