elinx / ugrad
A C++ implementation of the scalar-valued autograd engine micrograd
☆23Updated 5 years ago
Alternatives and similar repositories for ugrad
Users that are interested in ugrad are comparing it to the libraries listed below
- A C++ port of karpathy/llm.c that features a tiny torch library while maintaining overall simplicity.☆35Updated last year
- LLM training in simple, raw C/CUDA☆103Updated last year
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs☆351Updated 3 months ago
- An Open Convolutional Neural Network Framework in C++ From Scratch☆66Updated 4 years ago
- MLIR based Tiny Graph Compiler [dev-stage]☆20Updated 8 months ago
- A recurrent (LSTM) neural network in C☆94Updated 3 years ago
- Learning about CUDA by writing PTX code.☆133Updated last year
- Neural network from scratch in CUDA/C++☆83Updated 6 months ago
- Converting a deep neural network to integer-only inference in native C via uniform quantization and the fixed-point representation.☆25Updated 3 years ago
- Learn OpenCL step by step.☆138Updated 2 years ago
- CUDA Matrix Multiplication Optimization☆214Updated last year
- Serial and parallel implementations of matrix multiplication☆42Updated 4 years ago
- A header only library implementing common mathematical functions using SIMD intrinsics☆111Updated last month
- IREE's PyTorch Frontend, based on Torch Dynamo.☆94Updated this week
- Pure C ONNX runtime with zero dependencies for embedded devices☆210Updated last year
- Simple neural network implementation using CUDA technology. It is an educational implementation.☆97Updated 7 years ago
- A C/C++ implementation of micrograd: a tiny autograd engine with neural net on top.☆69Updated last year
- My C++ deep learning framework & other machine learning algorithms☆88Updated 2 years ago
- Swin Transformer C++ Implementation☆63Updated 4 years ago
- Custom PTX Instruction Benchmark☆126Updated 5 months ago
- A simple and fast minimalistic header-only library allowing to run async tasks and execute task graphs.☆53Updated 8 months ago
- High-Performance SGEMM on CUDA devices☆98Updated 6 months ago
- Header-only safetensors loader and saver in C++☆65Updated 3 months ago
- GPUOcelot: A dynamic compilation framework for PTX☆207Updated 6 months ago
- A curated list of awesome GPGPU (CUDA/OpenCL/Vulkan) resources☆99Updated 2 years ago
- C++ demo of deep neural networks (MLP, CNN)☆34Updated last year
- ☆16Updated 4 years ago
- Neural Network framework using Backpropagation in C☆76Updated 3 years ago
- Repo for AI Compiler team. The intended purpose of this repo is for implementation of a PJRT device.☆19Updated this week
- Implementation of convolution layer in different flavors☆68Updated 7 years ago