mbotsu / mlx_speech2text
Audio transcription using MLX Whisper and VAD silence processing
☆13 · Updated 4 months ago
Alternatives and similar repositories for mlx_speech2text:
Users who are interested in mlx_speech2text are comparing it to the libraries listed below.
- CLI tool to quantize GGUF, GPTQ, AWQ, HQQ and EXL2 models ☆69 · Updated 2 months ago
- Convert your PDFs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and efficient… ☆48 · Updated 3 weeks ago
- Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX ☆16 · Updated 4 months ago
- HunyuanVideo: A Systematic Framework For Large Video Generation Model ☆44 · Updated 2 months ago
- Yet Another (LLM) Web UI, made with Gemini ☆11 · Updated 2 months ago
- Lightweight continuous batching with OpenAI compatibility using HuggingFace Transformers, including T5 and Whisper. ☆20 · Updated 2 weeks ago
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆26 · Updated 4 months ago
- Ultra-minimal autoregressive diffusion model for image generation ☆18 · Updated 5 months ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆84 · Updated 2 months ago
- MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX. ☆103 · Updated 4 months ago
- (Windows/Linux/MacOS) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) in Python (Gradio interface). Translated on … ☆82 · Updated last week
- Easily convert HuggingFace models to GGUF format for llama.cpp ☆21 · Updated 7 months ago
- Video+code lecture on building nanoGPT from scratch ☆65 · Updated 8 months ago
- Gradio-based tool to run open-source LLM models directly from Huggingface ☆91 · Updated 8 months ago
- Very basic framework for composable parameterized large language model (Q)LoRA / (Q)DoRA fine-tuning using mlx, mlx_lm, and OgbujiPT. ☆37 · Updated last week
- A collection of notebooks for the Hugging Face blog series (https://huggingface.co/blog). ☆43 · Updated 7 months ago
- MLX image models for Apple Silicon machines ☆73 · Updated 3 months ago
- This repo provides a simple Gradio UI to run Qwen2 VL 72B AWQ in a venv, with both image and video inference working. ☆29 · Updated 5 months ago
- Experiments with BitNet inference on CPU ☆53 · Updated 11 months ago
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in MLX ☆19 · Updated 4 months ago
- Create text chunks which end at natural stopping points without using a tokenizer ☆26 · Updated 2 months ago
- A little file for doing LLM-assisted prompt expansion and image generation using Flux.schnell - complete with prompt history, prompt queu… ☆26 · Updated 6 months ago
- Screenshot LLM is a Python application that leverages the power of AI to analyze screenshots. Built with PyQt6 for a user-friendly interf… ☆37 · Updated 4 months ago