AnswerDotAI / gpu.cpp
A lightweight library for portable low-level GPU computation using WebGPU.
☆3,750 Updated last week
Related projects
Alternatives and complementary repositories for gpu.cpp
- Efficient Triton Kernels for LLM Training ☆3,401 Updated this week
- Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs. ☆4,178 Updated last week
- PyTorch native quantization and sparsity for training and inference ☆1,555 Updated this week
- ☆1,000 Updated 3 weeks ago
- Tile primitives for speedy kernels ☆1,645 Updated this week
- nanoGPT style version of Llama 3.1 ☆1,236 Updated 3 months ago
- Blazingly fast LLM inference. ☆4,418 Updated this week
- NanoGPT (124M) quality in 7.8 8xH100-minutes ☆965 Updated this week
- CUDA Core Compute Libraries ☆1,252 Updated this week
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,360 Updated this week
- Stable Diffusion and Flux in pure C/C++ ☆3,472 Updated 2 weeks ago
- Implementation for MatMul-free LM. ☆2,918 Updated last week
- Fast and accurate automatic speech recognition (ASR) for edge devices ☆2,133 Updated last week
- Official inference framework for 1-bit LLMs ☆10,977 Updated this week
- On-device AI across mobile, embedded and edge for PyTorch ☆2,154 Updated this week
- Solve puzzles. Learn CUDA. ☆9,869 Updated 2 months ago
- The n-gram Language Model ☆1,337 Updated 3 months ago
- UNet diffusion model in pure CUDA ☆573 Updated 4 months ago
- A native PyTorch Library for large model training ☆2,586 Updated last week
- ☆6,708 Updated last week
- Distributed LLM and StableDiffusion inference for mobile, desktop and server. ☆2,610 Updated 3 weeks ago
- A vector search SQLite extension that runs anywhere! ☆4,164 Updated this week
- ☆1,125 Updated last month
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆5,987 Updated this week
- The Tensor (or Array) ☆408 Updated 3 months ago
- High-resolution models for human tasks. ☆4,472 Updated 2 weeks ago
- High-efficiency floating-point neural network inference operators for mobile, server, and Web ☆1,879 Updated this week
- Material for gpu-mode lectures ☆2,986 Updated this week
- A minimal GPU design in Verilog to learn how GPUs work from the ground up ☆7,070 Updated 2 months ago
- Video+code lecture on building nanoGPT from scratch ☆3,580 Updated 3 months ago