iacopPBK / llama.cpp-gfx906
llama.cpp optimized for AMD GFX906 (MI50/MI60/Vega7) GPUs
☆31 · Updated this week
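For context, gfx906 is the ISA name ROCm uses for these GPUs. A minimal build sketch, assuming the fork follows upstream llama.cpp's HIP build flow; the GGML_HIP and AMDGPU_TARGETS flags are upstream llama.cpp conventions and are not confirmed for this fork:

```sh
# Minimal sketch, assuming a working ROCm toolchain is installed.
# Flags follow upstream llama.cpp's HIP build docs; this fork may differ.
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
  cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx906 -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -- -j 16
```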
Alternatives and similar repositories for llama.cpp-gfx906
Users interested in llama.cpp-gfx906 are comparing it to the repositories listed below
- High-Performance Text Deduplication Toolkit ☆56 · Updated last month
- Lightweight Inference server for OpenVINO ☆211 · Updated this week
- GPU Power and Performance Manager ☆61 · Updated 11 months ago
- Input your VRAM and RAM and the toolchain will produce a GGUF model tuned to your system within seconds — flexible model sizing and lowes… ☆45 · Updated this week
- The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm ☆398 · Updated this week
- ☆62 · Updated last year
- ☆395 · Updated 5 months ago
- ☆143 · Updated 2 weeks ago
- A daemon that automatically manages the performance states of NVIDIA GPUs. ☆96 · Updated 3 weeks ago
- Fork of vLLM for AMD MI25/50/60. A high-throughput and memory-efficient inference and serving engine for LLMs ☆64 · Updated 4 months ago
- InferX: Inference as a Service Platform ☆135 · Updated this week
- llama.cpp fork with additional SOTA quants and improved performance ☆1,198 · Updated this week
- ☆76 · Updated 2 weeks ago
- Onboarding documentation source for the AMD Ryzen™ AI Software Platform. The AMD Ryzen™ AI Software Platform enables developers to take… ☆78 · Updated this week
- NVIDIA Linux open GPU with P2P support ☆54 · Updated this week
- ☆468 · Updated this week
- ☆53 · Updated last year
- All-in-Storage Solution based on DiskANN for DRAM-free Approximate Nearest Neighbor Search ☆73 · Updated 2 months ago
- A faithful clone of Karpathy's llama2.c (one file inference, zero dependency) but fully functional with LLaMA 3 8B base and instruct mode… ☆138 · Updated last year
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆82 · Updated last week
- AI Tensor Engine for ROCm ☆279 · Updated this week
- A library and CLI utilities for managing performance states of NVIDIA GPUs. ☆28 · Updated 11 months ago
- Ampere-optimized llama.cpp ☆24 · Updated last week
- Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama - but purpose-built and deeply optimized for AMD NPUs. ☆207 · Updated this week
- vLLM for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60 ☆257 · Updated this week
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆163 · Updated last year
- Simple node proxy for llama-server that enables MCP use ☆13 · Updated 4 months ago
- Running SXM2/SXM3/SXM4 NVIDIA data center GPUs in consumer PCs ☆125 · Updated 2 years ago
- FamilyBench: an evaluation tool for testing the relational reasoning capabilities of Large Language Models (LLMs). ☆36 · Updated last month
- ☆85 · Updated last week