Convert your PDFs and EPUBs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and efficient processing for low-resource systems.
☆172 · Feb 26, 2026 · Updated last month
Alternatives and similar repositories for pdf-narrator
Users that are interested in pdf-narrator are comparing it to the libraries listed below.
- A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats includ… ☆1,394 · Apr 8, 2026 · Updated last week
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆54 · Apr 13, 2026 · Updated last week
- StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion ☆10 · Sep 22, 2024 · Updated last year
- An open-source read-along document reader server with high-quality TTS options, synchronized highlighting, and audiobook export for EPUB,… ☆306 · Updated this week
- Automatically convert epubs to audiobooks ☆259 · Mar 8, 2025 · Updated last year
- StyleTTS 2 Optimized Training Fork ☆33 · Feb 2, 2025 · Updated last year
- Code repository for TIDMAD: Time series Dataset for Discovering Dark Matter with AI Denoising. ☆15 · Apr 1, 2026 · Updated 2 weeks ago
- Accompanying repository for the paper "DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions" ☆39 · Oct 28, 2025 · Updated 5 months ago
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi… ☆15 · Updated this week
- The EveryVoice TTS Toolkit - Text To Speech for your language ☆43 · Apr 10, 2026 · Updated last week
- Data manipulation and transformation for audio signal processing, powered by PyTorch ☆11 · Sep 30, 2024 · Updated last year
- Repo of the paper "Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model" ☆15 · Jun 28, 2024 · Updated last year
- Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching ☆4,729 · Jan 4, 2026 · Updated 3 months ago
- Transform unstructured documents into actionable, structured data with enterprise-grade precision and reliability, ready for large-scale … ☆20 · Oct 13, 2025 · Updated 6 months ago
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval ☆13 · Jun 27, 2025 · Updated 9 months ago
- 🔥🔥 Kokoro in Rust. https://huggingface.co/hexgrad/Kokoro-82M Insanely fast, real-time TTS with high quality. ☆754 · Mar 11, 2026 · Updated last month
- SpeechPlus: Small LLM-Based Text-to-Speech Library 🚀 ☆20 · May 20, 2025 · Updated 10 months ago
- Unofficial implementation of ConvNeXt-TTS powered by lightning ☆18 · Oct 20, 2024 · Updated last year
- Forced alignment decoder for Whisper. ☆15 · Mar 13, 2024 · Updated 2 years ago
- Project of Singing Voice Conversion. ☆16 · Oct 27, 2023 · Updated 2 years ago
- Equal Loudness Filter ☆11 · Mar 4, 2019 · Updated 7 years ago
- Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986 ☆49 · Jan 19, 2026 · Updated 3 months ago
- TTS with kokoro and onnx runtime ☆2,468 · Jan 30, 2026 · Updated 2 months ago
- Text-to-speech converter for Basque and Spanish. It includes linguistic processing and built voices for both languages. Its… ☆18 · Jan 15, 2026 · Updated 3 months ago
- Grapheme-to-phoneme tool for corpus conversion, where phonemes match Phoible inventories ☆20 · Apr 10, 2025 · Updated last year
- Lightweight Gradio based WebUI for orpheusTTS - WSL / Linux [CUDA] ☆107 · Nov 19, 2025 · Updated 5 months ago
- [EMNLP 2025 Findings] Official code for EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion ☆37 · Sep 9, 2025 · Updated 7 months ago
- A tool that creates clips from longer videos (e.g., YouTube video to social shorts, aka OpusClip). ☆19 · Jan 13, 2026 · Updated 3 months ago
- MFLUX-WEBUI using MLX and the FLUX DEV and Schnell models ☆127 · Feb 15, 2026 · Updated 2 months ago
- Official PyTorch implementation of (ICME2025 oral) "AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-… ☆24 · Feb 1, 2026 · Updated 2 months ago
- This tool will help you build a 3D character rig without building it yourself from scratch. It will save you hours if not days of rigging… ☆27 · Aug 7, 2022 · Updated 3 years ago
- PortableApps.com Development Toolkit ☆13 · Apr 8, 2016 · Updated 10 years ago
- Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments. ☆18 · Aug 1, 2025 · Updated 8 months ago
- (Windows/Linux/MacOS) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) in Python (Gradio interface). Translated on … ☆106 · Apr 12, 2026 · Updated last week
- Synchronize SRT timestamps over an existing accurate transcription ☆41 · Nov 11, 2024 · Updated last year
- For loading and running Pixtral models ☆78 · Jan 31, 2025 · Updated last year
- Local11Labs allows generating high-quality text-to-speech and podcast content using the fast and tiny Kokoro-82M. ☆52 · Jan 13, 2025 · Updated last year
- Bypass for 剑网3 (JX3) protection ☆11 · Jul 28, 2017 · Updated 8 years ago
- Official Repository for "Efficient Vocal Source Separation Through Windowed RoFormer" ☆45 · Oct 30, 2025 · Updated 5 months ago