Said-Akbar / triton-gcn5

Triton for AMD MI25/50/60. Development repository for the Triton language and compiler

☆32

Alternatives and similar repositories for triton-gcn5

Users who are interested in triton-gcn5 are comparing it to the libraries listed below.

Said-Akbar / vllm-rocm
FORK of VLLM for AMD MI25/50/60. A high-throughput and memory-efficient inference and serving engine for LLMs
☆65 Updated 6 months ago
nlzy / vllm-gfx906
vLLM for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60
☆327 Updated last month
ROCm / TheRock
The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm
☆563 Updated this week
aikitoria / open-gpu-kernel-modules
NVIDIA Linux open GPU with P2P support
☆78 Updated 2 weeks ago
mixa3607 / ML-gfx906
ML software (llama.cpp, ComfyUI, vLLM) builds for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60
☆53 Updated last week
Nuullll / intel-extension-for-pytorch
A Python package for extending the official PyTorch that can easily obtain performance on Intel platform
☆47 Updated 11 months ago
nlzy / triton-gfx906
triton for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60
☆37 Updated last month
ikawrakow / ik_llama.cpp
llama.cpp fork with additional SOTA quants and improved performance
☆1,329 Updated this week
turboderp-org / exllamav3
An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs
☆571 Updated last week
gpustack / llama-box
LM inference server implementation based on *.cpp.
☆290 Updated 3 months ago
lemonade-sdk / llamacpp-rocm
Fresh builds of llama.cpp with AMD ROCm™ 7 acceleration
☆103 Updated this week
Nexesenex / croco.cpp
Croco.Cpp is a fork of KoboldCPP inferring GGML/GGUF models on CPU/Cuda with KoboldAI's UI. It's powered partly by IK_LLama.cpp, and compati…
☆153 Updated this week
Amblyopius / Stable-Diffusion-ONNX-FP16
Example code and documentation on how to get Stable Diffusion running with ONNX FP16 models on DirectML. Can run accelerated on all Direc…
☆300 Updated 2 years ago
ssiu / flash-attention-turing
☆58 Updated last month
Ai00-X / ai00_server
The all-in-one RWKV runtime box with embed, RAG, AI agents, and more.
☆587 Updated 3 weeks ago
AIIRWKV / RWKV-RAG
RAG SYSTEM FOR RWKV
☆51 Updated 11 months ago
ubergarm / r1-ktransformers-guide
run DeepSeek-R1 GGUFs on KTransformers
☆255 Updated 8 months ago
Joluck / RWKV-PEFT
☆153 Updated 3 weeks ago
KohakuBlueleaf / HakuRiver
A lightweight cluster manager that turns your small fleet of nodes into one powerful computer, using Docker for environment consistency w…
☆57 Updated last month
scottt / rocm-TheRock
The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm
☆112 Updated 2 weeks ago
arlo-phoenix / bitsandbytes-rocm-5.6
8-bit CUDA functions for PyTorch, ROCm compatible
☆41 Updated last year
william-murray1204 / stable-diffusion-cpp-python
stable-diffusion.cpp bindings for python
☆74 Updated last week
Thireus / GGUF-Tool-Suite
Input your VRAM and RAM and the toolchain will produce a GGUF model tuned to your system within seconds — flexible model sizing and lowes…
☆63 Updated this week
perk11 / large-model-proxy
Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe…
☆83 Updated 3 weeks ago
Orion-zhen / abliteration
Make abliterated models with transformers, easy and fast
☆92 Updated 7 months ago
RWKV / rwkv-onnx
A converter and basic tester for rwkv onnx
☆43 Updated last year
SystemPanic / vllm-windows
A high-throughput and memory-efficient inference and serving engine for LLMs (Windows build & kernels)
☆229 Updated last month
SynthiaDL / TrainChatGalRWKV
☆41 Updated last year
ROCm / vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
☆108 Updated this week
likelovewant / ROCmLibs-for-gfx1103-AMD780M-APU
ROCm library files for gfx1103, updated with other architectures, for AMD GPUs on Windows.
☆682 Updated 2 months ago