Convert your PDFs and EPUBs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and efficient processing for low-resource systems.
☆181 · Feb 26, 2026 · Updated 3 months ago
Alternatives and similar repositories for pdf-narrator
Users that are interested in pdf-narrator are comparing it to the libraries listed below.
- Transform your PDFs into captivating audio podcasts with this PDF-to-Podcast pipeline! Combining advanced language models and high-qualit… ☆17 · Nov 11, 2024 · Updated last year
- A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats includ… ☆1,526 · Apr 8, 2026 · Updated last month
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆54 · Apr 13, 2026 · Updated last month
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆10 · Sep 22, 2024 · Updated last year
- An open-source read-along document reader server with high-quality TTS options, synchronized highlighting, and audiobook export for EPUB,… ☆343 · May 22, 2026 · Updated last week
- Automatically convert epubs to audiobooks ☆258 · Mar 8, 2025 · Updated last year
- StyleTTS 2 Optimized Training Fork ☆32 · Feb 2, 2025 · Updated last year
- epub2tts-kokoro is a free and open source python app to easily create a full-featured audiobook from an epub or text file using realistic… ☆32 · Feb 14, 2026 · Updated 3 months ago
- Accompanying repository for the paper "DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions" ☆39 · Oct 28, 2025 · Updated 7 months ago
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi… ☆16 · Updated this week
- A self-hosted version of WaterCrawl, a powerful web crawling and data extraction platform. ☆13 · Jul 27, 2025 · Updated 10 months ago
- Data manipulation and transformation for audio signal processing, powered by PyTorch ☆11 · Sep 30, 2024 · Updated last year
- Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/multiplatform CPU, AMD, NVIDIA GPU PyTorch support, handling, and auto-s… ☆4,866 · May 19, 2026 · Updated last week
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale … ☆20 · Oct 13, 2025 · Updated 7 months ago
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval ☆13 · Jun 27, 2025 · Updated 11 months ago
- SpeechPlus: Small LLM-Based Text-to-Speech Library 🚀 ☆21 · May 20, 2025 · Updated last year
- Unofficial implementation of ConvNeXt-TTS powered by lightning ☆18 · Oct 20, 2024 · Updated last year
- Forced alignment decoder for Whisper. ☆16 · Mar 13, 2024 · Updated 2 years ago
- Project of Singing Voice Conversion. ☆16 · Oct 27, 2023 · Updated 2 years ago
- Equal Loudness Filter ☆11 · Mar 4, 2019 · Updated 7 years ago
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986 ☆49 · Jan 19, 2026 · Updated 4 months ago
- Scaled Uniform Noise for Ancestral & Stochastic samplers and Noisy latent image ☆17 · Mar 30, 2025 · Updated last year
- TTS with kokoro and onnx runtime ☆2,557 · Jan 30, 2026 · Updated 4 months ago
- Text-to-Speech converter for Basque and Spanish. It includes linguistic processing and built voices for the aforementioned languages. Its… ☆18 · Jan 15, 2026 · Updated 4 months ago
- Convert Files / Folders / GitHub Repos Into AI / LLM-ready Files ☆163 · Jan 31, 2025 · Updated last year
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories ☆19 · Apr 10, 2025 · Updated last year
- Demo repository for creating a custom chatbot powered by LLMs for Telegram and WhatsApp. ☆15 · Jan 18, 2024 · Updated 2 years ago
- Lightweight Gradio based WebUI for orpheusTTS - WSL / Linux [CUDA] ☆108 · Nov 19, 2025 · Updated 6 months ago
- OmegaViT (ΩViT) is a cutting-edge vision transformer architecture that combines multi-query attention, rotary embeddings, state space mod… ☆14 · May 18, 2026 · Updated last week
- various experiments for scaling inference time compute with small reasoning models ☆17 · Jan 16, 2025 · Updated last year
- Official PyTorch implementation of (ICME2025 oral) "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-… ☆26 · Feb 1, 2026 · Updated 3 months ago
- This tool will help you build a 3D character rig without building it yourself from scratch. It will save you hours if not days of rigging… ☆27 · Aug 7, 2022 · Updated 3 years ago
- [EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion ☆39 · Sep 9, 2025 · Updated 8 months ago
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments. ☆18 · Aug 1, 2025 · Updated 9 months ago
- A Rust library for building emulators based on various ZX Spectrum computer models and clones. ☆11 · Aug 2, 2023 · Updated 2 years ago
- (Windows/Linux/MacOS) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) on python (Gradio interface). Translated on … ☆106 · Apr 12, 2026 · Updated last month
- For loading and running Pixtral models ☆78 · Jan 31, 2025 · Updated last year
- IPA Phonetic dataset lexicon ☆18 · May 10, 2026 · Updated 2 weeks ago
- Local11Labs allows generating high-quality text-to-speech and podcast content using the fast and tiny Kokoro-82M. ☆51 · Jan 13, 2025 · Updated last year