ReinForce-II / mmapeak
☆38 · Updated 5 months ago
Alternatives and similar repositories for mmapeak
Users interested in mmapeak are comparing it to the libraries listed below.
- ☆149 · Updated 2 months ago
- GPU benchmark ☆67 · Updated 7 months ago
- ☆17 · Updated 9 months ago
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆73 · Updated 7 months ago
- NVIDIA Linux open GPU with P2P support ☆31 · Updated 2 weeks ago
- DFloat11: Lossless LLM Compression for Efficient GPU Inference ☆524 · Updated last week
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆153 · Updated this week
- Inference RWKV v7 in pure C. ☆38 · Updated last week
- ☆74 · Updated 8 months ago
- Fast and memory-efficient exact attention ☆183 · Updated 3 weeks ago
- High-Performance SGEMM on CUDA devices ☆99 · Updated 7 months ago
- Fast low-bit matmul kernels in Triton ☆356 · Updated last week
- LLM Inference on consumer devices ☆124 · Updated 5 months ago
- Samples of good AI-generated CUDA kernels ☆89 · Updated 3 months ago
- Inference of Mamba models in pure C ☆191 · Updated last year
- ☆217 · Updated 7 months ago
- LLM training in simple, raw C/HIP for AMD GPUs ☆52 · Updated 11 months ago
- Python bindings for ggml ☆146 · Updated last year
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆155 · Updated 10 months ago
- ☆85 · Updated last week
- llama.cpp to PyTorch Converter ☆34 · Updated last year
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models". ☆277 · Updated last year
- An innovative library for efficient LLM inference via low-bit quantization ☆348 · Updated last year
- FlashAttention (Metal Port) ☆526 · Updated 11 months ago
- Learning about CUDA by writing PTX code. ☆135 · Updated last year
- Experimental GPU language with meta-programming ☆22 · Updated 11 months ago
- Advanced Quantization Algorithm for LLMs and VLMs, with support for CPU, Intel GPU, CUDA and HPU. Seamlessly integrated with Torchao, Tra… ☆607 · Updated this week
- Fast Matrix Multiplications for Lookup Table-Quantized LLMs ☆374 · Updated 4 months ago
- Simple high-throughput inference library ☆127 · Updated 3 months ago
- ☆554 · Updated 10 months ago