cansik / onnxruntime-silicon
ONNX Runtime prebuilt wheels for Apple Silicon (M1 / M2 / M3 / ARM64)
☆207 · Updated 8 months ago
Alternatives and similar repositories for onnxruntime-silicon:
Users interested in onnxruntime-silicon are comparing it to the libraries listed below:
- Deploy a Stable Diffusion model with ONNX/TensorRT + Triton server ☆123 · Updated last year
- Use safetensors with ONNX 🤗 ☆50 · Updated 3 weeks ago
- 🐍 | Python library for RunPod API and serverless worker SDK. ☆217 · Updated last week
- ⚙️ | REPLACED BY https://github.com/runpod-workers | Official set of serverless workers provided by RunPod as endpoints. ☆57 · Updated last year
- ☆55 · Updated 2 years ago
- ONNX implementation of Whisper. PyTorch free. ☆92 · Updated 4 months ago
- Python bindings for ggml ☆140 · Updated 7 months ago
- Inference of Large Multimodal Models in C/C++. LLaVA and others ☆46 · Updated last year
- FlashAttention (Metal port) ☆465 · Updated 6 months ago
- MLX implementations of various transformers, speedups, training ☆34 · Updated last year
- Demo Python script to interact with a llama.cpp server using the Whisper API, microphone, and webcam devices. ☆46 · Updated last year
- Attempt at a Cog wrapper using ComfyUI to run an SDXL txt2img workflow config ☆23 · Updated last year
- FRP fork ☆158 · Updated 3 weeks ago
- ONNX-powered inference for state-of-the-art face upscalers ☆93 · Updated 8 months ago
- ☆53 · Updated 2 years ago
- A Gradio component designed to continuously show any logs. ☆40 · Updated 3 months ago
- MLX image models for Apple Silicon machines ☆76 · Updated 4 months ago
- A curated list of amazing RunPod projects, libraries, and resources ☆108 · Updated 7 months ago
- Flux diffusion model implementation using quantized fp8 matmul; remaining layers use faster half-precision accumulate, which is ~2x fast… ☆256 · Updated 5 months ago
- A set of custom nodes for ComfyUI that allow you to use Core ML models in your ComfyUI workflows. ☆157 · Updated 7 months ago
- Examples of models deployable with Truss ☆166 · Updated this week
- Optimum version of a UI for Stable Diffusion, running on ONNX models for faster inference, working on most common GPU vendors: NVIDIA, AMD… ☆24 · Updated last year
- ☆84 · Updated last year
- Port of Suno's Bark TTS transformer in Apple's MLX framework ☆78 · Updated last year
- A simple library to speed up CLIP inference by up to 3x (K80 GPU) ☆215 · Updated last year
- Python tools for WhisperKit: model conversion, optimization, and evaluation ☆209 · Updated 2 months ago
- A high-throughput and memory-efficient inference and serving engine for Whisper, https://mesolitica.com/blog/vllm-whisper ☆25 · Updated 8 months ago
- ☆52 · Updated 2 years ago
- Running F5-TTS with ONNX Runtime ☆135 · Updated this week
- Docker image for Audiocraft audio processing and generation with deep learning ☆1 · Updated 9 months ago