l4rz / running-nvidia-sxm-gpus-in-consumer-pcs
Running SXM2/SXM3/SXM4 NVidia data center GPUs in consumer PCs
☆64Updated last year
Related projects
Alternatives and complementary repositories for running-nvidia-sxm-gpus-in-consumer-pcs
- I've built a 4x V100 box for less than $5,500.☆124Updated 2 years ago
- Stable Diffusion inference on Intel Arc dGPUs☆68Updated 7 months ago
- ☆52Updated 4 months ago
- LLM training in simple, raw C/HIP for AMD GPUs☆37Updated last month
- A manual to help with using the Tesla P40 GPU☆102Updated last year
- Tiny ASIC implementation of the matrix multiplication unit from "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits"☆111Updated 6 months ago
- A simple Flash Attention v2 implementation with ROCm (RDNA3 GPU, roc wmma), mainly used for Stable Diffusion (ComfyUI) in Windows ZLUDA en…☆22Updated 2 months ago
- A daemon that automatically manages the performance states of NVIDIA GPUs.☆35Updated 2 weeks ago
- Make PyTorch models at least run on APUs.☆44Updated 10 months ago
- Nvidia Instruction Set Specification Generator☆215Updated 4 months ago
- Using a Tesla P40 for Gaming with an Intel iGPU as Display Output on Windows 11 22H2☆27Updated last year
- Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator☆207Updated 11 months ago
- 8-bit CUDA functions for PyTorch, ROCm compatible☆38Updated 7 months ago
- My development fork of llama.cpp. Currently working on the RK3588 NPU and Tenstorrent backends☆67Updated this week
- Attention in SRAM on Tenstorrent Grayskull☆29Updated 3 months ago
- llama.cpp fork with additional SOTA quants and improved performance☆89Updated this week
- Linux based GDDR6/GDDR6X VRAM temperature reader for NVIDIA RTX 3000/4000 series GPUs.☆79Updated 2 months ago
- Reverse engineering the RK3588 NPU☆63Updated 5 months ago
- Repository of model demos using TT-Buda☆55Updated last week
- GPU-targeted vendor-agnostic AI library for Windows, and Mistral model implementation.☆46Updated 9 months ago
- Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees" adapted for Llama models☆36Updated last year
- ☆134Updated this week
- ☆48Updated 4 months ago
- SLOP Detector and analyzer based on dictionary for shareGPT JSON and text☆44Updated last week
- Run stable-diffusion-webui with Radeon RX 580 8GB on Ubuntu 22.04.2 LTS☆57Updated last year
- Deep Learning Primitives and Mini-Framework for OpenCL☆174Updated 2 months ago
- SYCL implementation of Fused MLPs for Intel GPUs☆43Updated 2 weeks ago
- Train your own small bitnet model☆55Updated 3 weeks ago
- Ultra low overhead NVIDIA GPU telemetry plugin for telegraf with memory temperature readings.☆61Updated 4 months ago
- GroqFlow provides an automated tool flow for compiling machine learning and linear algebra workloads into Groq programs and executing tho…☆99Updated last week