itsmostafa / inference-speed-tests
Local LLM inference speed tests on various devices
☆79 · Updated 3 months ago
Alternatives and similar repositories for inference-speed-tests
Users interested in inference-speed-tests are comparing it to the libraries listed below.
- Optimized Ollama LLM server configuration for Mac Studio and other Apple Silicon Macs. Headless setup with automatic startup, resource op…☆199 · Updated 3 months ago
- Your gateway to both Ollama & Apple MLX models☆137 · Updated 3 months ago
- MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. I…☆424 · Updated 2 weeks ago
- ☆204 · Updated last month
- Mocking OpenAI API through ChatGPT Desktop app☆213 · Updated last week
- Library to traverse and control macOS☆151 · Updated 2 months ago
- An implementation of CSM (Conversational Speech Model) for Apple Silicon using MLX☆359 · Updated last month
- ☆145 · Updated last month
- This is a cross-platform desktop application that allows you to chat with locally hosted LLMs and enjoy features like MCP support☆221 · Updated last week
- Local LLM Powered Recursive Search & Smart Knowledge Explorer☆243 · Updated 4 months ago
- AI agent that controls your computer with OS-level tools; MCP compatible, works with any model☆85 · Updated 2 months ago
- A wannabe Ollama equivalent for Apple MLX models☆68 · Updated 3 months ago
- Easy-to-use interface for the Whisper model, optimized for all GPUs!☆225 · Updated this week
- MCP server that executes AppleScript, giving you full control of your Mac☆281 · Updated last month
- Open-source assistant using small models (2B-5B), with agentic and tool-calling capabilities and integration of RAG with efficient …☆201 · Updated 3 weeks ago
- Finally, an open-source YouTube summarizer extension☆73 · Updated 2 months ago
- A repository of Open-WebUI tools to use with your favourite LLMs☆232 · Updated last week
- The easiest way to run the fastest MLX-based LLMs locally☆287 · Updated 7 months ago
- ☆184 · Updated 2 months ago
- macOS menu-bar utility to adjust Apple Silicon GPU VRAM allocation☆199 · Updated 2 months ago
- Local image and music generation for Apple Silicon☆52 · Updated 3 months ago
- High-performance MLX-based LLM inference engine for macOS with native Swift implementation☆214 · Updated this week
- Command-line personal assistant using your favorite proprietary or local models, with access to 30+ tools☆109 · Updated 2 months ago
- A multi-agent AI architecture that connects 25+ specialized agents through n8n and MCP servers. Project NOVA routes requests to domain-sp…☆187 · Updated 2 weeks ago
- ☆103 · Updated last month
- A conversational speech generation model with Gradio UI and OpenAI-compatible API. UI and API support CUDA, MLX, and CPU devices☆190 · Updated last month
- FastMLX is a high-performance, production-ready API for hosting MLX models☆308 · Updated 3 months ago
- ☆165 · Updated 2 months ago
- reddacted lets you analyze & sanitize your online footprint using LLMs, PII detection & sentiment analysis to identify anything that migh…☆100 · Updated 3 weeks ago
- A multi-agent AI research system designed to know what it knows (and doesn't know) when conducting research and creating content☆163 · Updated 4 months ago