itsmostafa / inference-speed-tests
Local LLM inference speed tests on various devices
☆103 · Updated 6 months ago
Alternatives and similar repositories for inference-speed-tests
Users interested in inference-speed-tests are comparing it to the repositories listed below.
- Your gateway to both Ollama & Apple MLX models ☆146 · Updated 7 months ago
- MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. I… ☆573 · Updated last month
- Optimized Ollama LLM server configuration for Mac Studio and other Apple Silicon Macs. Headless setup with automatic startup, resource op… ☆260 · Updated 7 months ago
- High-performance MLX-based LLM inference engine for macOS with native Swift implementation ☆412 · Updated 2 weeks ago
- A wannabe Ollama equivalent for Apple MLX models ☆80 · Updated 7 months ago
- Accessing Apple Intelligence and ChatGPT desktop through OpenAI / Ollama API ☆293 · Updated last month
- macOS menu-bar utility to adjust Apple Silicon GPU VRAM allocation ☆227 · Updated 5 months ago
- AI agent that controls the computer with OS-level tools; MCP-compatible, works with any model ☆109 · Updated last month
- The easiest way to run the fastest MLX-based LLMs locally ☆303 · Updated 11 months ago
- macOS Whisper dictation app ☆420 · Updated this week
- Ollama desktop client for everyday use ☆80 · Updated 4 months ago
- Parse files (e.g. code repos) and websites to clipboard or a file for ingestion by AI / LLMs ☆304 · Updated 2 months ago
- Qwen Image models through MPS ☆214 · Updated 2 weeks ago
- FastMLX is a high-performance, production-ready API to host MLX models. ☆330 · Updated 6 months ago
- Local Apple Notes + LLM Chat ☆91 · Updated 7 months ago
- A cross-platform desktop application that lets you chat with locally hosted LLMs, with features like MCP support ☆224 · Updated 2 months ago
- Support for MLX models in LLM ☆217 · Updated 5 months ago
- An implementation of NVIDIA's Parakeet models for Apple Silicon using MLX. ☆515 · Updated last week
- LM Studio Apple MLX engine ☆790 · Updated last week
- MLX-GUI: an MLX inference server for Apple Silicon ☆125 · Updated last month
- ☆127 · Updated last week
- Library to traverse and control macOS ☆171 · Updated 5 months ago
- Nginx proxy server in a Docker container to authenticate and proxy requests to Ollama from the public internet via Cloudflare Tunnel ☆141 · Updated last month
- A macOS AppleScript MCP server ☆315 · Updated 5 months ago
- An implementation of CSM (Conversational Speech Model) for Apple Silicon using MLX. ☆378 · Updated last month
- Open-WebUI Tools is a modular toolkit designed to extend and enrich your Open WebUI instance, turning it into a powerful AI workstation. … ☆369 · Updated 2 weeks ago
- Documentation on setting up a local LLM server on Debian from scratch, using Ollama/llama.cpp/vLLM, Open WebUI, Kokoro FastAPI, and Comfy… ☆550 · Updated this week
- Your Local Artificial Memory on your Device. ☆493 · Updated 9 months ago
- ☆189 · Updated 6 months ago
- Open WebUI Desktop 🌐 (Alpha) ☆682 · Updated 2 months ago
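
Several of the servers in this list (Ollama, FastMLX, MLX Omni Server, LM Studio's MLX engine) expose an OpenAI-compatible HTTP API, so a single client snippet covers most of them. Below is a minimal sketch, assuming Ollama's default port (11434) and a hypothetical locally pulled model tag; the base URL and model name are assumptions, so check each project's README for its actual defaults.

```python
# Minimal sketch: one OpenAI-compatible chat request that works against
# several of the local servers listed above. base_url assumes Ollama's
# default endpoint; FastMLX, MLX Omni Server, etc. listen on their own ports.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # assumption: Ollama's default /v1 endpoint
    api_key="unused",  # local servers generally ignore the key, but the client requires one
)

response = client.chat.completions.create(
    model="llama3.2",  # hypothetical model tag; substitute whatever you have pulled locally
    messages=[{"role": "user", "content": "In one sentence, what is MLX?"}],
)
print(response.choices[0].message.content)
```

Because these servers share the wire format, switching between them for the kind of speed comparisons this repo collects is usually just a matter of changing `base_url` and `model`.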