aikitoria / open-gpu-kernel-modules
NVIDIA Linux open GPU with P2P support
☆60 · Updated last week
Alternatives and similar repositories for open-gpu-kernel-modules
Users interested in open-gpu-kernel-modules are comparing it to the libraries listed below.
- ☆43 · Updated 2 weeks ago
- A pipeline parallel training script for LLMs. ☆158 · Updated 5 months ago
- Automatically quantize GGUF models ☆212 · Updated last week
- Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining. ☆42 · Updated last week
- LLM Inference on consumer devices ☆124 · Updated 7 months ago
- InferX: Inference as a Service Platform ☆136 · Updated last week
- DFloat11: Lossless LLM Compression for Efficient GPU Inference ☆548 · Updated last month
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆81 · Updated this week
- Input your VRAM and RAM and the toolchain will produce a GGUF model tuned to your system within seconds — flexible model sizing and lowes… ☆58 · Updated this week
- Samples of good AI-generated CUDA kernels ☆91 · Updated 4 months ago
- Easily view and modify JSON datasets for large language models ☆83 · Updated 5 months ago
- Comparison of the output quality of quantization methods, using Llama 3, transformers, GGUF, EXL2. ☆165 · Updated last year
- ☆102 · Updated last month
- Sparse inferencing for transformer-based LLMs ☆201 · Updated 2 months ago
- A simple Flash Attention v2 implementation with ROCm (RDNA3 GPU, roc wmma), mainly used for Stable Diffusion (ComfyUI) in Windows ZLUDA en… ☆48 · Updated last year
- Distributed inference for MLX LLMs ☆97 · Updated last year
- Testing LLM reasoning abilities with family relationship quizzes. ☆62 · Updated 8 months ago
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆171 · Updated this week
- Dictionary-based SLOP detector and analyzer for ShareGPT JSON and plain text ☆76 · Updated 11 months ago
- ☆76 · Updated 9 months ago
- GPU Power and Performance Manager ☆60 · Updated last year
- ☆152 · Updated 3 months ago
- ☆83 · Updated 2 weeks ago
- ☆17 · Updated 10 months ago
- ☆62 · Updated 3 months ago
- ☆135 · Updated 5 months ago
- Fast and memory-efficient exact attention ☆193 · Updated this week
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆73 · Updated 8 months ago
- KoboldCpp Smart Launcher with GPU Layer and Tensor Override Tuning ☆28 · Updated 5 months ago
- A simple GUI utility for gathering LIMA-like chat data. ☆22 · Updated 2 weeks ago