Extends the conditioning of Stable Diffusion to take audio embeddings instead of text embeddings, using the Wav2Vec2-BERT model
☆13Sep 25, 2024Updated last year
Alternatives and similar repositories for audio2img
Users that are interested in audio2img are comparing it to the libraries listed below
- Vehicle speed estimation using YOLOv9 for object detection and DeepSORT for tracking☆16Sep 13, 2024Updated last year
- Simple LLM inference server☆20Jun 13, 2024Updated last year
- CoreXY conversion for the Folgertech FT-5 printer☆15Feb 20, 2024Updated 2 years ago
- Experimental sampler to make LLMs more creative☆31Aug 2, 2023Updated 2 years ago
- A modified Ziggurat Algorithm for efficiently generating exponentially- and normally-distributed PseudoRandom Numbers (PRNs).☆13May 21, 2025Updated 9 months ago
- ☆15Mar 11, 2025Updated 11 months ago
- A PyTorch implementation of the shearlet transform.☆13Oct 9, 2025Updated 4 months ago
- ComfyUI workflows to create smooth transitions between video clips using Wan VACE. Works with video from any model or other source-LTX-2,…☆31Feb 10, 2026Updated 3 weeks ago
- Roadmap to become a Linux-Fine☆10Jul 13, 2024Updated last year
- Optimization solvers in pure Python: LP, MILP, SAT, constraint programming, graph and metaheuristics. No dependencies. Solvor all your op…☆25Feb 1, 2026Updated last month
- A universal adapter including zero-copy Python bindings for Philip Turner's metal flash attention library.☆23Dec 15, 2025Updated 2 months ago
- ☆39Oct 29, 2025Updated 4 months ago
- Your FREE AWS Journey Starts Here! (O.V.E.R)☆10Feb 12, 2026Updated 3 weeks ago
- An extension for oobabooga's Text Generation WebUI☆11May 29, 2023Updated 2 years ago
- Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions☆22Feb 11, 2026Updated 3 weeks ago
- This is the repo with the code to conduct a comparative analysis of different audio representation models.☆12Aug 31, 2023Updated 2 years ago
- An agentic runtime that enables secure, extensible and configurable AI automation from any model☆17Updated this week
- JoVA: Unified Multimodal Learning for Joint Video-Audio Generation☆30Dec 22, 2025Updated 2 months ago
- A notebook containing implementations of different graph deep node embeddings along with benchmark graph neural network models in tensorf…☆13Jul 17, 2021Updated 4 years ago
- ☆14Dec 16, 2022Updated 3 years ago
- Experimental rewrite of Jellyfin Desktop built on CEF☆40Updated this week
- A simple SDXL fine-tuning toolkit based on the DreamBooth branch of AutoTrain Advanced from 🤗, inspired by the way ai-toolkit approaches…☆18Sep 30, 2024Updated last year
- Official code for "IT³: Idempotent Test-Time Training" (ICML 2025)☆14Jun 25, 2025Updated 8 months ago
- ☆17Apr 22, 2024Updated last year
- This toolkit provides a quickstart for working with OpenSearch and ML models, especially LLMs for vector embeddings to power semantic an…☆17Feb 25, 2026Updated last week
- For world model code developing and releasing.☆36Feb 6, 2026Updated last month
- miaoshouai-assistant for webui-forge☆15Aug 15, 2024Updated last year
- ☆12Dec 23, 2024Updated last year
- An implementation of LLMzip using GPT-2☆13Aug 7, 2023Updated 2 years ago
- ☆10May 14, 2024Updated last year
- ☆14Sep 19, 2024Updated last year
- A script for merging a LLM model and a LoRA☆13Jun 22, 2023Updated 2 years ago
- RLHF for Video Diffusion Models☆26Jul 30, 2025Updated 7 months ago
- xformers prebuild wheels for various video cards, suitable for both paperspace and google colab☆12Apr 7, 2023Updated 2 years ago
- Re-taking voice conversations to the moon 🚀☆12Nov 9, 2022Updated 3 years ago
- ☆12Oct 23, 2022Updated 3 years ago
- Distributed NeuroSynapse Engine leveraging Predictive Modeling and Streaming Analytics to drive Intelligent Data Insights Explorer.☆31Feb 19, 2026Updated 2 weeks ago
- Agentic BYOK Browser-Based Website Builder☆30Updated this week
- ComfyUI Plugin for Kandinsky2.2 (diffusers version)☆10Apr 2, 2025Updated 11 months ago