jack-tol / youtube-to-audio
A lightweight Python package and command-line interface (CLI) tool that extracts audio from YouTube videos and playlists in multiple formats, such as MP3, WAV, OGG, AAC, and FLAC.
☆12 Updated 3 weeks ago
Alternatives and similar repositories for youtube-to-audio:
Users who are interested in youtube-to-audio are comparing it to the repositories listed below.
- Provides custom Gradio components to make the diarization-based audio labeling process easier and faster. ☆61 Updated 3 weeks ago
- This public GitHub repository contains code for a fully self-hosted, on-premises transcription solution. ☆53 Updated 3 months ago
- Turn text from websites into spoken audio with edge-tts, F5, etc., and save it as MP3 files ☆45 Updated last month
- Video+code lecture on building nanoGPT from scratch ☆66 Updated 9 months ago
- ☆16 Updated last year
- ☆104 Updated this week
- ☆62 Updated 8 months ago
- Joint speech-language model - respond directly to audio! ☆30 Updated 10 months ago
- ☆107 Updated last year
- The next evolution of Agents ☆48 Updated 2 weeks ago
- AI search engine ☆12 Updated last month
- Open TTS models, built for streaming on the edge ☆39 Updated 2 weeks ago
- Efficient approach to speaker diarization using voice characteristics extraction ☆93 Updated 11 months ago
- ☆46 Updated 4 months ago
- Use quantized versions of Whisper to speed up inference ☆12 Updated 5 months ago
- Gradio-based tool to run open-source LLM models directly from Hugging Face ☆91 Updated 9 months ago
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G… ☆21 Updated this week
- This repo provides a simple Gradio UI to run Qwen2 VL 72B AWQ in venv and have both image and video inferencing work. ☆29 Updated 6 months ago
- [WIP] AI Try-On plugin for Chrome ☆27 Updated last year
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you? ☆21 Updated 9 months ago
- ☆92 Updated 3 months ago
- 🐍 🤖 Pip installable package for StyleTTS 2 human-level text-to-speech and voice cloning ☆156 Updated 8 months ago
- Choose a topic, a music genre and wait for the agents to generate a song ☆53 Updated 9 months ago
- [Interspeech 2024] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation ☆145 Updated last month
- Retrieve the source code for any model made available on replicate.com! ☆34 Updated last year
- Collection of Open Source Speech Data ☆152 Updated 4 months ago
- ☆37 Updated last year
- The original BabyAGI, updated with LiteLLM and no vector database reliance (csv instead) ☆21 Updated 6 months ago
- Simli WebRTC AI Agent demo ☆20 Updated 4 months ago
- Demo Python script app to interact with a llama.cpp server using the Whisper API, microphone, and webcam devices. ☆46 Updated last year