aidatatools / ollama-benchmark
LLM Benchmark for Throughput via Ollama (Local LLMs)
⭐203 · Updated last month
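For context on what a throughput benchmark like this measures: tools in this space typically call Ollama's `/api/generate` endpoint and derive tokens per second from the `eval_count` and `eval_duration` fields of the non-streaming response. Below is a minimal sketch of that idea, not the repository's actual implementation; the model tag `llama3.2` and the default port 11434 are assumptions.

```python
# Minimal throughput sketch against a local Ollama server.
# Assumptions: Ollama is running on the default port 11434 and the
# model tag below has already been pulled. This is NOT how
# aidatatools/ollama-benchmark is implemented, just the core idea:
# the /api/generate response reports eval_count (tokens generated)
# and eval_duration (nanoseconds), so tokens/sec follows directly.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint
MODEL = "llama3.2"  # assumption: any locally pulled model tag works here

payload = json.dumps({
    "model": MODEL,
    "prompt": "Explain the difference between a process and a thread.",
    "stream": False,  # one JSON object with timing stats instead of a stream
}).encode("utf-8")

req = urllib.request.Request(OLLAMA_URL, data=payload,
                             headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    stats = json.load(resp)

# eval_duration is reported in nanoseconds; convert to seconds.
tokens_per_sec = stats["eval_count"] / (stats["eval_duration"] / 1e9)
print(f"{MODEL}: {tokens_per_sec:.1f} tokens/sec "
      f"({stats['eval_count']} tokens generated)")
```

Looping this over several pulled models and averaging a few runs per prompt is the kind of workflow such benchmarks automate.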
Alternatives and similar repositories for ollama-benchmark:
Users interested in ollama-benchmark are comparing it to the libraries listed below:
- A proxy server for multiple Ollama instances with key security ⭐371 · Updated last month
- An Open WebUI function for a better R1 experience ⭐79 · Updated 3 weeks ago
- beep boop 🤖 ⭐90 · Updated 2 months ago
- Benchmark LLM performance ⭐95 · Updated 8 months ago
- Transparent proxy server with on-demand model swapping for llama.cpp (or any local OpenAI-compatible server) ⭐475 · Updated this week
- A repository of Open WebUI tools to use with your favourite LLMs ⭐175 · Updated this week
- Handy tool to measure the performance and efficiency of LLM workloads. ⭐51 · Updated last month
- ⭐154 · Updated this week
- Lightweight inference server for OpenVINO ⭐142 · Updated this week
- Code execution utilities for Open WebUI & Ollama ⭐264 · Updated 4 months ago
- API up your Ollama Server. ⭐141 · Updated 3 months ago
- Review/check GGUF files and estimate the memory usage and maximum tokens per second. ⭐135 · Updated 2 weeks ago
- Dagger functions to import Hugging Face GGUF models into a local ollama instance and optionally push them to ollama.com. ⭐115 · Updated 10 months ago
- ⭐83 · Updated 3 months ago
- Automatically quantize GGUF models ⭐164 · Updated this week
- ⭐197 · Updated last week
- A simple-to-use Ollama autocompletion engine with exposed options and streaming functionality ⭐121 · Updated 5 months ago
- Serving LLMs in the HF-Transformers format via a PyFlask API ⭐71 · Updated 6 months ago
- An OpenAI API-compatible API for chat with image input and questions about the images, aka multimodal. ⭐243 · Updated 3 weeks ago
- Optimized Ollama LLM server configuration for Mac Studio and other Apple Silicon Macs. Headless setup with automatic startup, resource op… ⭐149 · Updated 2 weeks ago
- Ollama client written in Python ⭐159 · Updated 3 months ago
- Practical and advanced guide to LLMOps. It provides a solid understanding of large language models' general concepts, deployment techniqu… ⭐62 · Updated 7 months ago
- 🚀 Retrieval Augmented Generation (RAG) with txtai. Combine search and LLMs to find insights with your own data. ⭐346 · Updated 3 months ago
- FastMLX is a high-performance, production-ready API to host MLX models. ⭐281 · Updated last week
- Ollama chat client in Vue; everything you need to run your private text RPG in the browser ⭐122 · Updated 5 months ago
- Free Search is a wrapper on top of publicly available SearXNG instances to give free internet access as a REST API. ⭐14 · Updated last week
- Link your Ollama models to LM Studio ⭐132 · Updated 8 months ago
- Run multiple resource-heavy large models (LMs) on the same machine with a limited amount of VRAM/other resources by exposing them on differe… ⭐55 · Updated last month
- Turns devices into a scalable LLM platform ⭐127 · Updated this week
- Efficient visual programming for AI language models ⭐353 · Updated 6 months ago