anthonix / llm.c
LLM training in simple, raw C/HIP for AMD GPUs
☆58 Updated last year
Alternatives and similar repositories for llm.c
Users who are interested in llm.c are comparing it to the repositories listed below.
- AI Tensor Engine for ROCm ☆351 Updated this week
- High-Performance FP32 GEMM on CUDA devices ☆117 Updated last year
- Fast and Furious AMD Kernels ☆348 Updated 2 weeks ago
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs ☆93 Updated this week
- Custom PTX Instruction Benchmark ☆138 Updated 11 months ago
- Samples of good AI generated CUDA kernels ☆99 Updated 8 months ago
- Learning about CUDA by writing PTX code. ☆152 Updated last year
- Super fast FP32 matrix multiplication on RDNA3 ☆82 Updated 10 months ago
- ☆219 Updated last year
- LLM training in simple, raw C/CUDA ☆112 Updated last year
- ☆95 Updated this week
- Fast low-bit matmul kernels in Triton ☆427 Updated last week
- Official Problem Sets / Reference Kernels for the GPU MODE Leaderboard! ☆201 Updated this week
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆71 Updated this week
- Fast Matrix Multiplications for Lookup Table-Quantized LLMs ☆387 Updated 9 months ago
- PTX-Tutorial Written Purely By AIs (Deep Research of Openai and Claude 3.7) ☆66 Updated 10 months ago
- Nvidia Instruction Set Specification Generator ☆311 Updated last year
- Ahead of Time (AOT) Triton Math Library ☆88 Updated last week
- Inference RWKV v7 in pure C. ☆44 Updated 3 months ago
- Learn CUDA with PyTorch ☆200 Updated this week
- ☆286 Updated this week
- 1.58 Bit LLM on Apple Silicon using MLX ☆243 Updated last year
- Fast and memory-efficient exact attention ☆214 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆113 Updated this week
- Quantized LLM training in pure CUDA/C++. ☆238 Updated 2 weeks ago
- Attention in SRAM on Tenstorrent Grayskull ☆40 Updated last year
- AMD related optimizations for transformer models ☆97 Updated 3 months ago
- Tenstorrent TT-BUDA Repository ☆314 Updated 10 months ago
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆251 Updated 9 months ago
- Explore training for quantized models ☆26 Updated 6 months ago