Said-Akbar / triton-gcn5
Triton for AMD MI25/50/60. Development repository for the Triton language and compiler
☆32Updated last month
Alternatives and similar repositories for triton-gcn5
Users who are interested in triton-gcn5 are comparing it to the libraries listed below.
- Fork of vLLM for AMD MI25/50/60. A high-throughput and memory-efficient inference and serving engine for LLMs☆64Updated 5 months ago
- vLLM for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60☆307Updated 3 weeks ago
- A converter and basic tester for rwkv onnx☆42Updated last year
- NVIDIA Linux open GPU with P2P support☆66Updated 2 weeks ago
- ML software (llama.cpp, ComfyUI, vLLM) builds for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60☆34Updated last week
- The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm☆514Updated this week
- ROCm Docker images with fixes/support for the legacy gfx803 architecture, e.g. Radeon RX 590/RX 580/RX 570/RX 480☆76Updated 5 months ago
- A lightweight cluster manager that turns your small fleet of nodes into one powerful computer, using Docker for environment consistency w…☆55Updated 2 weeks ago
- A Python package for extending the official PyTorch that can easily obtain performance on Intel platform☆47Updated 10 months ago
- llama.cpp-gfx906☆45Updated last month
- A high-throughput and memory-efficient inference and serving engine for LLMs☆107Updated this week
- The HIP Environment and ROCm Kit - A lightweight open source build system for HIP and ROCm☆109Updated 2 weeks ago
- A simple webui for stable-diffusion.cpp☆47Updated last week
- Automatically quantize GGUF models☆214Updated last week
- Make abliterated models with transformers, easy and fast☆90Updated 6 months ago
- A Docker image based on rocm/pytorch with support for gfx803 (Polaris 20-21 (XT/PRO/XL); RX580; RX570; RX560) and Python 3.8☆24Updated 2 years ago
- Fresh builds of llama.cpp with AMD ROCm™ 7 acceleration☆79Updated this week
- Triton for AMD gfx906 GPUs, e.g. Radeon VII / MI50 / MI60☆32Updated last month
- Example code and documentation on how to get Stable Diffusion running with ONNX FP16 models on DirectML. Can run accelerated on all Direc…☆299Updated 2 years ago
- ☆150Updated this week
- A finetuning pipeline for instruct tuning Raven 14bn using 4-bit QLoRA and the Ditty finetuning library☆28Updated last year
- 8-bit CUDA functions for PyTorch, ported to HIP for use in AMD GPUs☆51Updated 2 years ago
- ROCm library files for gfx1103, updated with other AMD GPU architectures, for use on Windows.☆660Updated last month
- This project is established for real-time training of the RWKV model.☆49Updated last year
- RAG SYSTEM FOR RWKV☆51Updated 10 months ago
- GPU Power and Performance Manager☆60Updated last year
- A torchless, c++ rwkv implementation using 8bit quantization, written in cuda/hip/vulkan for maximum compatibility and minimum dependenci…☆313Updated last year
- Stable Diffusion and Flux in pure C/C++☆21Updated this week
- Fast and memory-efficient exact attention☆194Updated last week
- ☆41Updated last year