lemonade-sdk / llamacpp-rocm
Fresh builds of llama.cpp with AMD ROCm™ 7 acceleration
☆129 · Updated this week
Alternatives and similar repositories for llamacpp-rocm
Users interested in llamacpp-rocm are comparing it to the repositories listed below.
- Inference engine for Intel devices. Serves LLMs, VLMs, Whisper, Kokoro-TTS, embedding, and rerank models over OpenAI-compatible endpoints. ☆260 · Updated last week
- Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama, but purpose-built and deeply optimized for AMD NPUs. ☆488 · Updated last week
- ☆612 · Updated this week
- ☆154 · Updated last month
- Docs for GGUF quantization (unofficial) ☆330 · Updated 4 months ago
- AMD (Radeon GPU) ROCm-based setup for popular AI tools on Ubuntu 24.04.1 ☆216 · Updated last week
- ☆50 · Updated last month
- GPU Power and Performance Manager ☆62 · Updated last year
- llama.cpp fork with additional SOTA quants and improved performance ☆1,358 · Updated this week
- ☆195 · Updated 3 months ago
- Run LLM agents on Ryzen AI PCs in minutes ☆792 · Updated this week
- vLLM for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60 ☆338 · Updated this week
- Run multiple resource-heavy large models (LM) on the same machine with a limited amount of VRAM/other resources by exposing them on differe… ☆84 · Updated last week
- A cross-platform desktop application that lets you chat with locally hosted LLMs, with features like MCP support ☆226 · Updated 3 months ago
- A web application that converts speech to speech, 100% private ☆81 · Updated 6 months ago
- KoboldCpp smart launcher with GPU layer and tensor-override tuning ☆29 · Updated 6 months ago
- A daemon that automatically manages the performance states of NVIDIA GPUs. ☆100 · Updated last month
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆588 · Updated this week
- A platform to self-host AI on easy mode ☆178 · Updated this week
- ☆228 · Updated 7 months ago
- Input your VRAM and RAM and the toolchain will produce a GGUF model tuned to your system within seconds — flexible model sizing and lowes… ☆66 · Updated this week
- A persistent local memory for AI, LLMs, or Copilot in VS Code. ☆175 · Updated last month
- ☆176 · Updated 3 months ago
- ☆87 · Updated 2 weeks ago
- ☆418 · Updated 8 months ago
- Reliable model swapping for any local OpenAI/Anthropic-compatible server: llama.cpp, vLLM, etc. ☆1,977 · Updated last week
- ☆90 · Updated last week
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications. ☆29 · Updated 10 months ago
- No-code CLI designed for accelerating ONNX workflows ☆219 · Updated 5 months ago
- llama.cpp runner/swapper and proxy that emulates the LM Studio / Ollama backends ☆48 · Updated 3 months ago