l4rz / running-nvidia-sxm-gpus-in-consumer-pcs
Running SXM2/SXM3/SXM4 NVidia data center GPUs in consumer PCs
☆79 · Updated last year
Alternatives and similar repositories for running-nvidia-sxm-gpus-in-consumer-pcs:
Users interested in running-nvidia-sxm-gpus-in-consumer-pcs are comparing it to the repositories listed below.
- I've built a 4x V100 box for less than $5,500. ☆128 · Updated 3 years ago
- Stable Diffusion inference on Intel Arc dGPUs ☆72 · Updated 10 months ago
- Linux-based GDDR6/GDDR6X VRAM temperature reader for NVIDIA RTX 3000/4000 series GPUs ☆89 · Updated 5 months ago
- LLM training in simple, raw C/HIP for AMD GPUs ☆37 · Updated 4 months ago
- A simple Flash Attention v2 implementation with ROCm (RDNA3 GPU, roc wmma), mainly used for Stable Diffusion (ComfyUI) on Windows ZLUDA en… ☆35 · Updated 5 months ago
- 8-bit CUDA functions for PyTorch ☆42 · Updated 2 weeks ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆88 · Updated this week
- ☆54 · Updated 7 months ago
- My development fork of llama.cpp, currently working on the RK3588 NPU and Tenstorrent backends ☆80 · Updated last week
- A manual to help with using the Tesla P40 GPU ☆114 · Updated 2 months ago
- Fast and memory-efficient exact attention ☆152 · Updated this week
- 8-bit CUDA functions for PyTorch, ROCm compatible ☆39 · Updated 10 months ago
- Inference code for LLaMA models ☆41 · Updated last year
- LLM Benchmark for Throughput via Ollama (Local LLMs) ☆162 · Updated last week
- Tiny ASIC implementation of the matrix multiplication unit from "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" ☆115 · Updated 9 months ago
- A daemon that automatically manages the performance states of NVIDIA GPUs ☆51 · Updated 3 months ago
- Make PyTorch models at least run on APUs ☆46 · Updated last year
- Simple LLM inference server ☆20 · Updated 7 months ago
- Docker image for NVIDIA GH200 machines - optimized for vLLM serving and HF Trainer finetuning ☆26 · Updated this week
- Hackable and optimized Transformers building blocks, supporting a composable construction ☆22 · Updated last week
- Port of Facebook's LLaMA model in C/C++ ☆20 · Updated last year
- An OpenAI API compatible LLM inference server based on ExLlamaV2 ☆24 · Updated 11 months ago
- GPU Power and Performance Manager ☆52 · Updated 3 months ago
- ☆100 · Updated last month
- Juice Community Version Public Release ☆548 · Updated last year
- Generate Large Language Model text in a container ☆20 · Updated last year
- Editor with LLM generation tree exploration ☆10 · Updated this week
- ☆40 · Updated last year
- Easily view and modify JSON datasets for large language models ☆69 · Updated 3 months ago