jack-tol / youtube-to-audio
A lightweight Python package and command-line interface (CLI) tool that extracts audio from YouTube videos and playlists in multiple formats, such as MP3, WAV, OGG, AAC, and FLAC.
☆11 · Updated last month
Related projects
Alternatives and complementary repositories for youtube-to-audio
- Video+code lecture on building nanoGPT from scratch ☆64 · Updated 4 months ago
- VoiceRestore: Flow-Matching Transformers for Universal Speech Restoration ☆81 · Updated last month
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution. ☆39 · Updated 2 weeks ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster. ☆45 · Updated this week
- Sing an idea ➡️ AI music sample 🔥🎶 ☆90 · Updated 6 months ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆67 · Updated 6 months ago
- Speech recognition & diarisation solution with text alignment, deployed in AML pipelines ☆84 · Updated 6 months ago
- On-device streaming text-to-speech engine powered by deep learning ☆54 · Updated last week
- Collection of Open Source Speech Data ☆144 · Updated this week
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆135 · Updated 3 months ago
- Interface for OuteTTS models. ☆347 · Updated this week
- VALL-E 2 reproduction ☆83 · Updated 3 months ago
- Joint speech-language model - respond directly to audio! ☆355 · Updated 4 months ago
- Text-to-Music Generation with Rectified Flow Transformer ☆45 · Updated 2 months ago
- Scripts to create your own moe models using mlx ☆86 · Updated 8 months ago
- Uses deepgram/whisper/custom models to create an LJSpeech dataset for voice model fine tuning ☆12 · Updated this week
- AI 3D avatar voice interface in browser. VAD -> STT -> LLM -> TTS -> VRM (Prototype/Proof-of-Concept) ☆64 · Updated last year
- Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch ☆340 · Updated last week
- Speaker Diarization with Transformers ☆59 · Updated 5 months ago
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆70 · Updated 3 weeks ago
- Fine-tune your own MusicGen with LoRA ☆106 · Updated 6 months ago