elinx / ugrad
A C++ implementation of the scalar-valued autograd engine micrograd
☆23 · Updated 5 years ago
Alternatives and similar repositories for ugrad
Users who are interested in ugrad compare it to the libraries listed below.
- Multi-Threaded FP32 Matrix Multiplication on x86 CPUs ☆364 · Updated 5 months ago
- A faithful clone of Karpathy's llama2.c (one file inference, zero dependency) but fully functional with LLaMA 3 8B base and instruct mode… ☆138 · Updated last year
- Neural Network framework using Backpropagation in C ☆77 · Updated 3 years ago
- A C++ port of karpathy/llm.c featuring a tiny torch library while maintaining overall simplicity. ☆36 · Updated last year
- Converting a deep neural network to integer-only inference in native C via uniform quantization and the fixed-point representation. ☆25 · Updated 3 years ago
- A header only library implementing common mathematical functions using SIMD intrinsics ☆113 · Updated last month
- A neural network implementation for the MNIST dataset, written in plain C ☆98 · Updated 4 years ago
- Neural network from scratch in CUDA/C++ ☆86 · Updated last month
- A recurrent (LSTM) neural network in C ☆95 · Updated 3 years ago
- Swin Transformer C++ Implementation ☆63 · Updated 4 years ago
- Learn OpenCL step by step. ☆136 · Updated 3 years ago
- GPUOcelot: A dynamic compilation framework for PTX ☆210 · Updated 8 months ago
- My C++ deep learning framework & other machine learning algorithms ☆88 · Updated 2 years ago
- Learn OpenMP examples step by step ☆97 · Updated 8 months ago
- Pure C ONNX runtime with zero dependencies for embedded devices ☆211 · Updated last year
- Teaching Vectorization and SIMD using Intel Intrinsics in a Computer Organization and Architecture class ☆16 · Updated 7 months ago
- A C/C++ implementation of micrograd: a tiny autograd engine with neural net on top. ☆71 · Updated 2 years ago
- A C++ port of karpathy/micrograd, a tiny scalar-valued autograd engine and a neural net library ☆13 · Updated last year
- Algorithms implemented in CUDA + resources about GPGPU ☆58 · Updated 3 years ago
- Lightweight C implementation of CNNs for Embedded Systems ☆61 · Updated 2 years ago
- NVIDIA tools guide ☆143 · Updated 9 months ago
- Some CUDA example code with READMEs. ☆175 · Updated 7 months ago
- ☆112 · Updated 2 years ago
- MLIR based Tiny Graph Compiler [dev-stage] ☆19 · Updated 10 months ago
- Isolating mlir tutorial dialect implementation ☆25 · Updated 2 months ago
- A tree-walker && virtual-machine && JIT interpreter for Lox language ☆30 · Updated last year
- Examples from the "C++ From Scratch" Series ☆95 · Updated 2 years ago
- Super fast FP32 matrix multiplication on RDNA3 ☆75 · Updated 6 months ago
- Serial and parallel implementations of matrix multiplication ☆44 · Updated 4 years ago
- A plugin for Jupyter Notebook to run CUDA C/C++ code ☆246 · Updated last year