olealgoritme / gddr6
Linux-based GDDR6/GDDR6X VRAM temperature reader for NVIDIA RTX 3000/4000 series GPUs.
☆99 · Updated 2 weeks ago
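As background on what the tool does: nvidia-smi/NVML does not report the GDDR6/GDDR6X memory-junction temperature on these consumer cards, so gddr6 reads it directly from the GPU's BAR0 MMIO registers. Below is a minimal Python sketch of that technique, not the repo's actual code: `BAR0_BASE`, `TEMP_REG_OFFSET`, and the bit decoding are hypothetical placeholders (the real base address comes from lspci/sysfs for your card, and the register offset differs per GPU die; see the repo's device table).

```python
import mmap
import os
import struct

# Hypothetical values for illustration only: the real BAR0 base comes from
# lspci/sysfs for your specific card, and the temperature register offset
# is device-specific (see the gddr6 repo's device table).
BAR0_BASE = 0xA0000000          # physical address of the GPU's BAR0 region
TEMP_REG_OFFSET = 0x0000E2A8    # device-specific register offset
PAGE_SIZE = mmap.PAGESIZE

def read_vram_temp_c() -> int:
    """Read the GDDR6/GDDR6X junction temperature (requires root)."""
    # mmap offsets must be page-aligned, so map the page containing the
    # register and index to the register within that page.
    reg_phys = BAR0_BASE + TEMP_REG_OFFSET
    page_base = reg_phys & ~(PAGE_SIZE - 1)
    in_page = reg_phys - page_base

    fd = os.open("/dev/mem", os.O_RDONLY | os.O_SYNC)
    try:
        with mmap.mmap(fd, PAGE_SIZE, mmap.MAP_SHARED,
                       mmap.PROT_READ, offset=page_base) as mem:
            raw = struct.unpack_from("<I", mem, in_page)[0]
    finally:
        os.close(fd)

    # The low bits encode the temperature. This decoding is an assumption
    # for illustration; consult the repo for the real per-device decoding.
    return (raw & 0x00000FFF) // 0x20

if __name__ == "__main__":
    print(f"VRAM temperature: {read_vram_temp_c()} °C")
```

Run as root; on kernels built with strict /dev/mem protection, reading MMIO ranges this way may additionally require booting with `iomem=relaxed`.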
Alternatives and similar repositories for gddr6
Users interested in gddr6 are comparing it to the libraries listed below.
- Core, Junction, and VRAM temperature reader for Linux + GDDR6/GDDR6X GPUs ☆40 · Updated 5 months ago
- 8-bit CUDA functions for PyTorch ☆52 · Updated 2 weeks ago
- Simple monkeypatch to boost AMD Navi 3 GPUs ☆39 · Updated 3 weeks ago
- build scripts for ROCm ☆186 · Updated last year
- An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs ☆351 · Updated this week
- 8-bit CUDA functions for PyTorch, ROCm compatible ☆40 · Updated last year
- Prometheus exporter for GDDR6/GDDR6X VRAM and GPU core hot-spot temperatures on NVIDIA RTX 3000/4000 series GPUs under Linux (a minimal exporter sketch follows this list) ☆19 · Updated 7 months ago
- ☆41 · Updated last year
- ☆313 · Updated last month
- NVIDIA Linux open GPU with P2P support ☆22 · Updated 2 weeks ago
- a simple Flash Attention v2 implementation with ROCm (RDNA3 GPU, roc wmma), mainly used for stable diffusion (ComfyUI) in Windows ZLUDA en… ☆41 · Updated 8 months ago
- GPU benchmark ☆61 · Updated 3 months ago
- Fast and memory-efficient exact attention ☆174 · Updated this week
- ☆68 · Updated 4 months ago
- ☆14 · Updated 5 months ago
- Make PyTorch models at least run on APUs. ☆54 · Updated last year
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆152 · Updated 11 months ago
- LLM inference in C/C++ ☆76 · Updated this week
- Deep Learning Primitives and Mini-Framework for OpenCL ☆195 · Updated 8 months ago
- Train Llama Loras Easily ☆31 · Updated last year
- Running SXM2/SXM3/SXM4 NVIDIA data center GPUs in consumer PCs ☆106 · Updated last year
- Efficient 3bit/4bit quantization of LLaMA models ☆19 · Updated last year
- 8-bit CUDA functions for PyTorch, ported to HIP for use on AMD GPUs ☆49 · Updated 2 years ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆64 · Updated last year
- Framework-agnostic Python runtime for RWKV models ☆146 · Updated last year
- Make abliterated models with transformers, easy and fast ☆68 · Updated 3 weeks ago
- llama.cpp fork with additional SOTA quants and improved performance ☆439 · Updated this week
- Prepare for DeepSeek R1 inference: benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆71 · Updated 3 months ago
- Fast inference engine for Transformer models ☆32 · Updated 6 months ago
- Dictionary-based SLOP detector and analyzer for ShareGPT JSON and text ☆67 · Updated 6 months ago
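The Prometheus exporter entry above pairs the same register read with a scrape endpoint. Here is a minimal sketch of that pattern, assuming the hypothetical `read_vram_temp_c()` from the earlier example; the metric name and port are illustrative choices, not the exporter's actual ones.

```python
import time

from prometheus_client import Gauge, start_http_server

# Reuses the hypothetical read_vram_temp_c() from the earlier sketch.
# Metric name and port are illustrative, not the exporter's actual values.
VRAM_TEMP = Gauge(
    "gpu_vram_temperature_celsius",
    "GDDR6/GDDR6X VRAM junction temperature",
)

if __name__ == "__main__":
    start_http_server(9101)  # Prometheus scrapes http://localhost:9101/metrics
    while True:
        VRAM_TEMP.set(read_vram_temp_c())
        time.sleep(5)
```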