Convert your PDFs and EPUBs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and efficient processing for low-resource systems.
☆178 · Feb 26, 2026 · Updated 2 months ago
Alternatives and similar repositories for pdf-narrator
Users who are interested in pdf-narrator are comparing it to the libraries listed below.
- Transform your PDFs into captivating audio podcasts with this PDF-to-Podcast pipeline! Combining advanced language models and high-qualit… ☆17 · Nov 11, 2024 · Updated last year
- A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats includ… ☆1,439 · Apr 8, 2026 · Updated last month
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆10 · Sep 22, 2024 · Updated last year
- An open-source read-along document reader server with high-quality TTS options, synchronized highlighting, and audiobook export for EPUB,… ☆316 · Updated this week
- epub2tts-kokoro is a free and open source python app to easily create a full-featured audiobook from an epub or text file using realistic… ☆30 · Feb 14, 2026 · Updated 2 months ago
- Accompanying repository for the paper "DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions" ☆39 · Oct 28, 2025 · Updated 6 months ago
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi… ☆16 · Apr 13, 2026 · Updated 3 weeks ago
- The EveryVoice TTS Toolkit - Text To Speech for your language ☆43 · May 1, 2026 · Updated last week
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model" ☆15 · Jun 28, 2024 · Updated last year
- Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching ☆4,811 · Jan 4, 2026 · Updated 4 months ago
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale … ☆20 · Oct 13, 2025 · Updated 6 months ago
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval ☆13 · Jun 27, 2025 · Updated 10 months ago
- 🔥🔥 Kokoro in Rust. https://huggingface.co/hexgrad/Kokoro-82M Insanely fast, real-time, high-quality TTS. ☆766 · Mar 11, 2026 · Updated last month
- Update script for Manjaro ☆11 · Aug 9, 2024 · Updated last year
- SpeechPlus: Small LLM-Based Text-to-Speech Library 🚀 ☆21 · May 20, 2025 · Updated 11 months ago
- Project of Singing Voice Conversion. ☆16 · Oct 27, 2023 · Updated 2 years ago
- Equal Loudness Filter ☆11 · Mar 4, 2019 · Updated 7 years ago
- Scaled Uniform Noise for Ancestral & Stochastic samplers and Noisy latent image ☆17 · Mar 30, 2025 · Updated last year
- TTS with kokoro and onnx runtime ☆2,514 · Jan 30, 2026 · Updated 3 months ago
- Convert Files / Folders / GitHub Repos Into AI / LLM-ready Files ☆162 · Jan 31, 2025 · Updated last year
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories ☆20 · Apr 10, 2025 · Updated last year
- Demo repository for creating a custom chatbot powered by LLMs for Telegram and Whatsapp. ☆15 · Jan 18, 2024 · Updated 2 years ago
- Lightweight Gradio based WebUI for orpheusTTS - WSL / Linux [CUDA] ☆108 · Nov 19, 2025 · Updated 5 months ago
- various experiments for scaling inference time compute with small reasoning models ☆17 · Jan 16, 2025 · Updated last year
- NewsAgent is an enterprise-grade news aggregation agent designed to fetch, query, and summarize news from multiple sources at scale. ☆28 · Oct 13, 2025 · Updated 6 months ago
- This tool will help you build a 3D character rig without building it yourself from scratch. It will save you hours if not days of rigging… ☆27 · Aug 7, 2022 · Updated 3 years ago
- PortableApps.com Development Toolkit ☆13 · Apr 8, 2016 · Updated 10 years ago
- (Windows/Linux/MacOS) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) on python (Gradio interface). Translated on … ☆107 · Apr 12, 2026 · Updated 3 weeks ago
- IPA Phonetic dataset lexicon ☆19 · Updated this week
- Local11Labs allows generating high-quality text-to-speech and podcast content using the fast and tiny Kokoro-82M. ☆52 · Jan 13, 2025 · Updated last year
- Turn any common eBook file into an HQ Audiobook with F5-TTS (Easy Install) ☆39 · Apr 6, 2026 · Updated last month
- Bypass the protection of JX3 (剑网3) ☆11 · Jul 28, 2017 · Updated 8 years ago
- Text-to-Speech Benchmark ☆24 · Apr 2, 2026 · Updated last month
- Character-level conversion between Hebrew text and Latin transliteration using deep learning - a demonstration of seq2seq training. ☆14 · Jun 27, 2023 · Updated 2 years ago
- High-Fidelity Neural Phonetic Posteriorgrams ☆122 · Feb 23, 2025 · Updated last year
- Text to speech is an emerging zone of AI. This repository helps to create a dataset with audio and transcripts for personalized text to s… ☆28 · Mar 14, 2023 · Updated 3 years ago
- ☆11 · Mar 8, 2022 · Updated 4 years ago
- Speech Transmission Index (STI) from real speech waveforms ☆15 · May 1, 2011 · Updated 15 years ago
- ☆42 · Sep 21, 2025 · Updated 7 months ago