vLLM for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60
☆422Feb 20, 2026Updated 3 months ago
Alternatives and similar repositories for vllm-gfx906
Users that are interested in vllm-gfx906 are comparing it to the libraries listed below.
- ML software (llama.cpp, ComfyUI, vLLM) builds for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60☆254May 30, 2026Updated 2 weeks ago
- Triton for AMD MI25/50/60. Development repository for the Triton language and compiler☆34Dec 15, 2025Updated 6 months ago
- Fork of vLLM for AMD MI25/50/60. A high-throughput and memory-efficient inference and serving engine for LLMs☆70May 4, 2025Updated last year
- ☆442Apr 4, 2025Updated last year
- Advanced interoperability middleware for GPGPU acceleration. Facilitates cross vendor hardware abstraction and API translation for parall…☆95May 23, 2026Updated 3 weeks ago
- Hybrid-inference extension plugin for vllm that supports multi-NUMA hybrid inference; single-GPU inference of the Qwen3-Next model can reach 1000+ prefill☆33Nov 7, 2025Updated 7 months ago
- ROCm Container 6.2 with PyTorch 2.4 for ComfyUI with RX570/RX580/RX590 aka Polaris AMD GPU Support☆12Feb 8, 2025Updated last year
- llama.cpp fork with additional SOTA quants and improved performance☆2,737Updated this week
- ☆19Aug 19, 2025Updated 10 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆121Updated this week
- The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm☆1,077Updated this week
- LM inference server implementation based on *.cpp.☆292Nov 24, 2025Updated 6 months ago
- The High Performance LLM Native Mock Server☆29May 24, 2026Updated 3 weeks ago
- Proxy for OpenAI☆16Sep 2, 2025Updated 9 months ago
- Training Hierarchical Reasoning Models for next-token prediction.☆40Aug 7, 2025Updated 10 months ago
- Smart OpenAI‑compatible proxy for llama.cpp: manages slots, saves/restores KV cache to disk, routes requests by prefix similarity, and pr…☆46Nov 14, 2025Updated 7 months ago
- ML305A_ML307A_OpenCPU_Standard_1.4.2.2023062518_release☆23May 20, 2025Updated last year
- Extension for Forge-based UIs (Forge, reForge, etc) and ComfyUI to replace CFG with Negative Rejection Steering☆16May 16, 2026Updated last month
- Open source tool for transcription and subtitling, alternative to happyscribe.☆36Feb 12, 2025Updated last year
- The main repository for building Pascal-compatible versions of ML applications and libraries.☆206Aug 23, 2025Updated 9 months ago
- ☆34Jul 31, 2025Updated 10 months ago
- ☆33Apr 19, 2025Updated last year
- Agentic BYOK Browser-Based Website Builder☆46Updated this week
- Raspberry Pi qwen-omni voice assistant with no separate TTS/STT required☆17Apr 4, 2025Updated last year
- Fork of ollama for vulkan support☆22Apr 16, 2025Updated last year
- A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations☆17,282Jun 7, 2026Updated last week
- ☆83Feb 28, 2025Updated last year
- ☆51Oct 1, 2025Updated 8 months ago
- Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.☆30Jan 19, 2025Updated last year
- ☆45Aug 31, 2015Updated 10 years ago
- Get up and running with Llama 3, Mistral, Gemma, and other large language models, with added support for more AMD GPUs.☆1,764Updated this week
- Small 3D-printed Raspberry Pi NAS with support for up to 4 2.5" SSDs☆15Apr 22, 2023Updated 3 years ago
- "An optimizer custom node for ComfyUI that ensures each queue execution starts in an optimal state by clearing unused VRAM and unnecessar…☆20Jul 18, 2025Updated 11 months ago
- A skin smoothing filter to beautify faces.☆15Jan 18, 2021Updated 5 years ago
- ☆33Feb 10, 2025Updated last year
- A Streamlit app for generating high-quality Q&A training datasets from text and PDFs, leveraging Gemini, Claude, and OpenAI for LLM fine-…☆41Jul 5, 2025Updated 11 months ago
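- fastllm is a high-performance LLM inference library with no backend dependencies. It supports both tensor-parallel inference for dense models and hybrid-mode inference for MoE models; any GPU with 10 GB or more of memory can run the full DeepSeek model. A dual-socket 9004/9005 server plus a single GPU can deploy the original full-size, full-precision DeepSeek model at 20 tps for a single concurrent request; the INT4-quantized model at single concurrency reaches 30tp…☆4,770Updated this week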
- Profiling Google Gemma 3n Model Using PyTorch Profiler☆17Jul 7, 2025Updated 11 months ago
- A polyphonic music transcription Vamp plugin☆10Nov 20, 2019Updated 6 years ago