AnswerDotAI / gpu.cpp
A lightweight library for portable low-level GPU computation using WebGPU.
☆3,871 Updated 3 months ago
Alternatives and similar repositories for gpu.cpp
Users who are interested in gpu.cpp are comparing it to the libraries listed below.
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆6,459 Updated last week
- General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). … ☆2,231 Updated this week
- On-device AI across mobile, embedded and edge for PyTorch ☆2,934 Updated this week
- Implementation for MatMul-free LM. ☆3,009 Updated 7 months ago
- Tile primitives for speedy kernels ☆2,438 Updated last week
- Performance-portable, length-agnostic SIMD with runtime dispatch ☆4,683 Updated this week
- A minimal GPU design in Verilog to learn how GPUs work from the ground up ☆8,420 Updated 9 months ago
- Distributed LLM and StableDiffusion inference for mobile, desktop and server. ☆2,856 Updated 7 months ago
- Stable Diffusion and Flux in pure C/C++ ☆4,155 Updated 3 months ago
- NVIDIA Linux open GPU with P2P support ☆1,167 Updated last week
- Run PyTorch LLMs locally on servers, desktop and mobile ☆3,590 Updated 3 weeks ago
- ☆1,261 Updated 8 months ago
- Playing around "Less Slow" coding practices in C++ 20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception ha… ☆1,785 Updated 3 weeks ago
- Tensor library for machine learning ☆12,673 Updated this week
- A machine learning compiler for GPUs, CPUs, and ML accelerators ☆3,243 Updated this week
- CUDA Core Compute Libraries ☆1,680 Updated this week
- LLM training in simple, raw C/CUDA ☆26,856 Updated last month
- Blazingly fast LLM inference. ☆5,699 Updated this week
- CUDA Python: Performance meets Productivity ☆2,746 Updated this week
- JSON for Classic C++ ☆722 Updated 6 months ago
- Schedule-Free Optimization in PyTorch ☆2,174 Updated 3 weeks ago
- [ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl ☆2,302 Updated last year
- The official PyTorch implementation of Google's Gemma models ☆5,474 Updated 2 weeks ago
- Deep learning at the speed of light. ☆1,697 Updated this week
- Inference Llama 2 in one file of pure C ☆18,449 Updated 10 months ago
- A PyTorch native platform for training generative AI models ☆3,912 Updated this week
- Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, an… ☆1,397 Updated this week
- An Extensible Deep Learning Library ☆2,084 Updated this week
- Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch ☆1,833 Updated 2 months ago
- Intermediate Graphics Library (IGL) is a cross-platform library that commands the GPU. It provides a single low-level cross-platform inte… ☆3,078 Updated this week