mag- / gpu_benchmark
GPU benchmark
☆67 · Updated 7 months ago
Alternatives and similar repositories for gpu_benchmark
Users who are interested in gpu_benchmark are comparing it to the libraries listed below.
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆129 · Updated 8 months ago
- High-Performance SGEMM on CUDA devices ☆97 · Updated 7 months ago
- 👷 Build compute kernels ☆119 · Updated this week
- Load compute kernels from the Hub ☆258 · Updated this week
- ☆85 · Updated last week
- ☆17 · Updated 8 months ago
- A collection of tricks and tools to speed up transformer models ☆170 · Updated 2 months ago
- PTX-Tutorial Written Purely By AIs (Deep Research of OpenAI and Claude 3.7) ☆66 · Updated 5 months ago
- RWKV-7: Surpassing GPT ☆94 · Updated 9 months ago
- ☆87 · Updated last year
- ☆54 · Updated 2 months ago
- train with kittens! ☆62 · Updated 10 months ago
- [WIP] Better (FP8) attention for Hopper ☆32 · Updated 6 months ago
- Experimental GPU language with meta-programming ☆22 · Updated 11 months ago
- Samples of good AI generated CUDA kernels ☆89 · Updated 3 months ago
- ☆61 · Updated 5 months ago
- ☆88 · Updated last year
- ring-attention experiments ☆149 · Updated 10 months ago
- QuIP quantization ☆57 · Updated last year
- Make triton easier ☆47 · Updated last year
- LLM training in simple, raw C/CUDA ☆104 · Updated last year
- ☆74 · Updated 8 months ago
- research impl of Native Sparse Attention (2502.11089) ☆60 · Updated 6 months ago
- A safetensors extension to efficiently store sparse quantized tensors on disk ☆153 · Updated this week
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆73 · Updated 6 months ago
- Inference of Mamba models in pure C ☆191 · Updated last year
- Focused on fast experimentation and simplicity ☆75 · Updated 8 months ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆46 · Updated last year
- The evaluation framework for training-free sparse attention in LLMs ☆91 · Updated 2 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆28 · Updated 3 months ago